WO2024127313A1 - Metrics calculation and visualization in digital oral care - Google Patents

Info

Publication number
WO2024127313A1
WO2024127313A1 (PCT Application No. PCT/IB2023/062707)
Authority
WO
WIPO (PCT)
Prior art keywords
oral care
metrics
tooth
setups
teeth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/IB2023/062707
Other languages
French (fr)
Inventor
Michael Starr
Jonathan D. Gandrud
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
3M Innovative Properties Co
Original Assignee
3M Innovative Properties Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 3M Innovative Properties Co filed Critical 3M Innovative Properties Co
Priority to EP23829131.4A (published as EP4634931A1)
Publication of WO2024127313A1
Current legal status: Ceased

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/30 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to physical therapies or activities, e.g. physiotherapy, acupressure or exercising
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Definitions

  • the methods may modify at least one of the structure or the shape of the tooth (or other 3D oral care representation), such as modifying one of the position or orientation of a mesh element, within the digital 3D model of the tooth (or other 3D oral care representation) to generate the modified state of the tooth (or other 3D oral care representation).
  • a mesh element comprises at least one of a point, a vertex, an edge, a face or a voxel.
  • the methods may apply a smoothing operation to one or more mesh elements.
  • the methods may remove one or more mesh elements from the tooth (or other 3D oral care representation).
  • the methods may add one or more mesh elements to the tooth (or other 3D oral care representation).
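
To make these mesh-element operations concrete, the following is a minimal sketch using the trimesh library; the file name and the specific edits are illustrative assumptions, not taken from this disclosure.

```python
# A minimal sketch of per-mesh-element edits, assuming trimesh.
import numpy as np
import trimesh

mesh = trimesh.load("tooth.stl")  # hypothetical file name

# Modify the position of a single mesh element (a vertex).
verts = mesh.vertices.copy()
verts[0] += np.array([0.0, 0.0, 0.1])  # nudge 0.1 mm along z
mesh.vertices = verts

# Remove mesh elements: drop the first face, then any orphaned vertices.
keep = np.ones(len(mesh.faces), dtype=bool)
keep[0] = False
mesh.update_faces(keep)
mesh.remove_unreferenced_vertices()

# Apply a smoothing operation to the mesh elements (in place).
trimesh.smoothing.filter_laplacian(mesh, iterations=5)
```
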
  • FIG.3 shows a plot of orthodontic metrics (for UR1) which have undergone dimensionality reduction.
  • FIG.4 shows a plot of orthodontic metrics (for the full arch) which have undergone dimensionality reduction.
  • FIG.5 shows the progress in orthodontic treatment of a patient case.
  • FIG.6 shows a tooth-arch alignment oral care metric.
  • FIG.7 shows the progress scores for the orthodontic treatment of a patient case.
  • FIG.8 shows a buccolingual inclination oral care metric.
  • FIG.9 shows a midline oral care metric.
  • FIG.10 shows an overjet oral care metric.
  • the metric value may be received as the input to the machine learning models described herein, as a way of training that model or those models to encode a distribution of such a metric over the several examples of the training dataset.
  • the network may then receive metric value(s) as input, to assist in training the network to link that inputted metric value to the physical aspects of the ground truth oral care mesh which is used in loss calculation.
  • a loss calculation may quantify the difference between a prediction and a ground truth example (e.g., between a predicted oral care mesh and a ground truth oral care mesh).
  • the network techniques of this disclosure may, through the course of loss calculation and subsequent backpropagation, train the network to encode a distribution of a given metric.
  • aspects of the present disclosure reduce computing resource consumption by decimating 3D representations of the patient’s dentition (e.g., reducing the counts of mesh elements used to describe aspects of the patient’s dentition) so that computing resources are not unnecessarily wasted by processing excess quantities of mesh elements.
  • decimating the meshes does not reduce the overall predictive accuracy of the computing system (and indeed may actually improve predictions because the input provided to the ML model after decimation is a more accurate (or better) representation of the patient’s dentition). For example, noise or other artifacts which are unimportant (and which may reduce the accuracy of the predictive models) are removed.
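
A minimal decimation sketch follows, assuming the Open3D library; the file name and target triangle count are illustrative.

```python
# A sketch of mesh decimation prior to ML inference, assuming Open3D.
import open3d as o3d

mesh = o3d.io.read_triangle_mesh("dentition_scan.ply")  # hypothetical path
print(f"before: {len(mesh.triangles)} triangles")

# Quadric decimation reduces the mesh element count while preserving shape,
# which can both reduce compute and suppress scan noise/artifacts.
decimated = mesh.simplify_quadric_decimation(target_number_of_triangles=10000)
decimated = decimated.remove_degenerate_triangles()
print(f"after: {len(decimated.triangles)} triangles")
```
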
  • computing systems specifically adapted to visualize and/or analyze configurations of orthodontic setups for oral care appliance generation are improved.
  • aspects of the present disclosure improve the performance of a computing system for visualizing oral care metrics data by reducing the consumption of computing resources.
  • aspects of the present disclosure reduce computing resource consumption by reducing the dimensionality of high-dimensional vectors of oral care metrics (e.g., reducing hundreds or thousands of dimensions to 2 or 3 dimensions which can be easily plotted and visualized) so that computing resources are not unnecessarily wasted by visualizing large quantities of oral care metrics plots.
  • aspects of the present disclosure may need to be executed in a time-constrained manner, such as when an oral care appliance must be generated for a patient immediately after intraoral scanning (e.g., while the patient waits in the clinician’s office).
  • aspects of the present disclosure are necessarily rooted in the underlying computer technology of oral care metrics dimensionality reduction and visualization and cannot be performed by a human, even with the aid of pen and paper.
  • IPR information (e.g., quantity of IPR that is to be performed on one or more teeth, as measured in millimeters, or one or more binary flags to indicate whether or not IPR is to be performed on each tooth identified by flagging) may be concatenated with a latent vector which is produced by a VAE or a latent capsule autoencoder.
  • the vector(s) and/or capsule(s) resulting from such a concatenation may be provided to one or more of the neural networks of the present disclosure, with the technical improvement or added advantage of enabling that predictive neural network to account for IPR.
  • IPR is especially relevant to setups prediction methods, which may determine the positions and poses of teeth at the end of treatment or during one or more stages during treatment.
  • a VAE may be trained to perform this embedding operation, a U-Net may be trained to perform such an embedding, or a simple dense or fully connected network may be trained, or a combination thereof.
  • the transformer-based techniques of this disclosure may predict an action for an individual tooth, or may predict actions for multiple teeth (e.g., predict transformations for each of multiple teeth).
  • a 3D mesh transformer may include a transformer encoder structure (which may encode oral care data), and may be followed by a transformer decoder structure.
  • the 3D mesh transformer encoder may encode oral care data into a latent representation, which may be combined with attention information (e.g., to concatenate a vector of attention information to the latent representation).
  • a transformer may include modules such as one or more of: multi-headed attention modules, feed forward modules, normalization modules, linear modules, softmax modules, and/or convolution models for latent vector compression and/or representation.
  • the encoder may be stacked one or more times, thereby further encoding the oral care data, and enabling different representations of the oral care data to be learned (e.g., different latent representations). These representations may be embedded with attention information (which may influence the decoder’s focus to the relevant portions of the latent representation of the oral care data) and may be provided to the decoder in continuous form (e.g., as a concatenation of latent representations – such as latent vectors).
  • the latent output generated by the transformer encoder may be used to predict mesh element labels for mesh segmentation or mesh cleanup.
  • Such a transformer encoder (or transformer decoder) may be trained, at least in part, using a cross-entropy loss function (or other loss functions described herein), which may compare predicted mesh element labels to ground truth (or reference) mesh element labels.
  • Multi-headed attention and transformers may be advantageously applied to the setups-generation problem. Multi-headed attention is a module in a 3D transformer encoder network which computes the attention weights for the provided oral care data and produces an output vector with encoded information on how each example of oral care data should attend to the other oral care data in an arch.
  • multi-headed attention may enable the transformer to attend to mesh elements within local neighborhoods (or cliques), or to attend to global dependencies between mesh elements (or cliques).
  • multi-headed attention may enable a transformer for setups prediction (e.g., a setups prediction model which is based on a transformer) to generate a transform for a tooth, and to substantially concurrently attend to each of the other teeth in the arch while that transform is generated.
  • the transform for each tooth may be generated in light of the poses of one or more other teeth in the arch, leading to a more accurate transform (e.g., a transform which conforms more closely to the ground truth or reference transform).
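
The following sketch illustrates this idea of per-tooth attention, assuming PyTorch; the tensor shapes, the 7-element transform output, and the layer sizes are illustrative assumptions rather than details from this disclosure.

```python
# A sketch of multi-headed attention across per-tooth latent vectors.
import torch
import torch.nn as nn

num_teeth, embed_dim = 16, 128  # one latent vector per tooth in the arch
tooth_latents = torch.randn(1, num_teeth, embed_dim)  # (batch, seq, dim)

attn = nn.MultiheadAttention(embed_dim=embed_dim, num_heads=8, batch_first=True)

# Each tooth "attends" to every other tooth in the arch, so the transform
# predicted for one tooth can account for the poses of its neighbors.
attended, attn_weights = attn(tooth_latents, tooth_latents, tooth_latents)

transform_head = nn.Linear(embed_dim, 7)     # e.g., 3D translation + quaternion
tooth_transforms = transform_head(attended)  # one predicted transform per tooth
```
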
  • One implementation of the GDL Setups neural network model may include a representation generation module (e.g., containing a U-Net structure, an autoencoder encoder, a transformer encoder, another type of encoder-decoder structure, or an encoder, etc.) which may provide its output to a module which is trained to generate tooth transforms (e.g., a set of fully connected layers with optional skip connections, or an encoder structure), to generate the prediction of a transform for each individual tooth.
  • Skip connections may, in some implementations, connect the outputs of a particular layer in a neural network to the inputs of another layer in the neural network (e.g., a layer which is not immediately adjacent to the originating layer).
  • such orthodontic metrics may be incorporated into the feature vector for a mesh element, where these per-element feature vectors are provided to the setups prediction network as inputs.
  • such orthodontic metrics may be directly consumed by a generator, an MLP, a transformer, or other neural network as direct inputs (such as presented in one or more input vectors of real numbers, as described elsewhere in this disclosure).
  • the use of such orthodontic metrics in the training of the generator may improve the performance (i.e., correctness) of the resulting generator, resulting in predicted transforms which place teeth more nearly in the correct final setups poses than would otherwise be possible.
  • Such orthodontic metrics may be consumed by an encoder structure or by a U-Net structure (in the case of GDL Setups).
  • This metric may share some computational elements with the archform_parallelism_global orthodontic metric, except that this metric may input the mean distance from a tooth origin to the line formed by the neighboring teeth in opposing arches (e.g., a tooth in the upper arch and the corresponding tooth in the lower arch). The mean distance may be computed for one or more such pairs of teeth. In some implementations, this may be computed for all pairs of teeth. Then the mean distance may be subtracted from the distance that is computed for each tooth pair. This OM may yield the deviation of a tooth from a “typical” tooth parallelism in the arch.
  • This OM may compute how far forward or behind the tooth is positioned on the l-axis relative to the tooth or teeth of interest in the opposing arch.
  • Crossbite - Fossa in at least one upper molar may be located by finding the halfway point between distal and mesial marginal ridge saddles of the tooth.
  • a lower molar cusp may lie between the marginal ridges of the corresponding upper molar.
  • Compute Curve of Spee in this plane by measuring the distance from the farthest of the projected intermediate points to the projected curve-of-spee line segment. This yields a measure for the curvature of the arch relative to the occlusal plane.
  • 4) Skip the projection and compute the distances and curvatures in the 3D space.
  • Compute Curve of Spee by measuring the distance from the farthest of the intermediate points to the curve-of-spee line segment. This yields a measure for the curvature of the arch in 3D space.
  • [0099] 5) Compute the slope of the projected curve-of-spee line segment on the occlusal plane.
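
A sketch of variant 4 above (computing the measurement directly in 3D space, without projection) follows; it assumes cusp-tip points have already been extracted and ordered from anterior to posterior.

```python
# A sketch of the Curve of Spee measurement in 3D space.
import numpy as np

def point_to_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def curve_of_spee(cusp_points):
    """Max distance of the intermediate cusp tips to the segment joining
    the first and last cusp tips: a measure of arch curvature."""
    pts = np.asarray(cusp_points, dtype=float)
    a, b = pts[0], pts[-1]
    return max(point_to_segment_distance(p, a, b) for p in pts[1:-1])
```
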
  • the neural networks of this disclosure may exploit one or more benefits of the operation of parameter tuning, whereby the inputs and parameters of a neural network are optimized to produce more data-precise results.
  • One parameter which may be tuned is neural network learning rate (e.g., which may have values such as 0.1, 0.01, 0.001, etc.).
  • activation functions impart non-linear behavior to the network, including: sigmoid/logistic activation functions, Tanh (hyperbolic tangent) functions, rectified linear units (ReLU), leaky ReLU functions, parametric ReLU functions, exponential linear units (ELU), softmax function, swish function, Gaussian error linear unit (GELU), or scaled exponential linear unit (SELU).
  • a linear activation function may be well suited to some regression applications (among other applications), in an output layer.
  • a sigmoid/logistic activation function may be well suited to some binary classification applications (among other applications), in an output layer.
  • Softmax activation function may be well suited to some multiclass classification applications (among other applications), in an output layer.
  • a transform may be described by a 9x1 transformation vector (e.g., that specifies a translation vector and a quaternion). In other implementations, a transform may be described by a transformation matrix (e.g., a 4x4 affine transformation matrix).
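
As an illustration, a translation-plus-quaternion transform can be packed into a 4x4 affine matrix as follows, using SciPy; the 7-element layout shown here is an assumption (the 9x1 vector described above may carry additional fields).

```python
# A sketch converting translation + quaternion into a 4x4 affine matrix.
import numpy as np
from scipy.spatial.transform import Rotation

def to_affine(translation, quaternion_xyzw):
    mat = np.eye(4)
    mat[:3, :3] = Rotation.from_quat(quaternion_xyzw).as_matrix()
    mat[:3, 3] = translation
    return mat

# Example: a 1 mm / -0.5 mm translation with an identity rotation.
tooth_pose = to_affine([1.0, 0.0, -0.5], [0.0, 0.0, 0.0, 1.0])
```
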
  • systems of this disclosure may implement a principal components analysis (PCA) on an oral care mesh, and use the resulting principal components as at least a portion of the representation of the oral care mesh in subsequent machine learning and/or other predictive or generative processing.
  • An autoencoder may be trained to generate a latent form of a 3D oral care representation.
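
A minimal sketch of PCA over mesh vertices follows, assuming scikit-learn; the vertex array and component count are illustrative.

```python
# A sketch of PCA over mesh vertex coordinates as a compact representation.
import numpy as np
from sklearn.decomposition import PCA

vertices = np.random.rand(5000, 3)  # stand-in for an oral care mesh's vertices

pca = PCA(n_components=3)
pca.fit(vertices)

# Principal axes and explained variance can serve as part of the mesh's
# representation for downstream predictive or generative models.
representation = np.concatenate([pca.components_.ravel(),
                                 pca.explained_variance_ratio_])
```
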
  • a neural network which was previously trained on a first dataset may subsequently receive further training on oral care data and be applied to oral care applications (such as setups prediction). Transfer learning may be employed to further train any of the following networks: GCN (Graph Convolutional Networks), PointNet, ResNet or any of the other neural networks from the published literature which are listed above.
  • a first neural network may be trained to predict coordinate systems for teeth (such as by using the techniques described in WO2022123402A1 or US Provisional Application No. US63/366492).
  • a 3D representation may be produced using a 3D scanner, such as an intraoral scanner, a computerized tomography (CT) scanner, ultrasound scanner, a magnetic resonance imaging (MRI) machine or a mobile device which is enabled to perform stereophotogrammetry.
  • a 3D representation may describe the shape and/or structure of a subject.
  • a 3D representation may include one or more 3D mesh, 3D point cloud, and/or a 3D voxelized representation, among others.
  • one or more mesh element features may be computed, at least in part, via deep feature synthesis (DFS), e.g., as described in: J. M. Kanter and K. Veeramachaneni, “Deep feature synthesis: Towards automating data science endeavors,” 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2015.
  • mesh element features may convey aspects of a 3D representation’s surface shape and/or structure to the neural network models of this disclosure. Each mesh element feature describes distinct information about the 3D representation that may not be redundantly present in other input data that are provided to the neural network.
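
A sketch of assembling per-mesh-element feature vectors follows, assuming the trimesh library; the chosen features (normals, positions, a discrete curvature estimate) are illustrative, not an exhaustive list from this disclosure.

```python
# A sketch of per-mesh-element (per-vertex) feature vectors.
import numpy as np
import trimesh

mesh = trimesh.load("tooth.stl")  # hypothetical file name

# A discrete mean-curvature estimate over a small neighborhood (radius in mm).
curvature = trimesh.curvature.discrete_mean_curvature_measure(
    mesh, mesh.vertices, radius=0.5)

mesh_element_features = np.hstack([
    np.asarray(mesh.vertex_normals),  # surface orientation (3)
    np.asarray(mesh.vertices),        # spatial position (3)
    curvature[:, None],               # local shape information (1)
])  # shape: (num_vertices, 7)
```
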
  • Predictive models which may operate on feature vectors of the aforementioned features include but are not limited to: GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups, Diffusion Setups, PT Setups, Similarity Setups, Tooth Classification, Setups Classification, Setups Comparison, VAE Mesh Element Labeling, MAE Mesh In-filling, Mesh Reconstruction Autoencoder, Validation Using Autoencoders, Mesh Segmentation, Coordinate System Prediction, Mesh Cleanup, Restoration Design Generation, Appliance Component Generation and Placement, and Archform Prediction.
  • Such feature vectors may be presented to the input of a predictive model.
  • Some implementations of optimization algorithms may compute orthodontic metrics through the course of operation, as a means of quantifying progress and guiding the optimization algorithm towards an end state (e.g., the prediction of a final setup).
  • the present technique uses a dimensionality reduction technique, such as the t-SNE or U-Map technique, for dimensionality reduction of orthodontic metrics data.
  • the techniques of this disclosure may, in some examples, further implement visualization techniques that highlight the results of the dimensionality reduction.
  • Some implementations may further implement the plotting of the progress of an orthodontic treatment plan, over many successive stages.
  • the technique may also apply to the visualization of other types of metrics in digital dentistry and digital orthodontics, such as for metrics which may be used in the automation of dental restoration appliance creation (i.e., for the 3M® Filtek™ Matrix), such as Restoration Design Metrics (RDM) (e.g., as described under the “Restoration Design Metrics Calculation” heading herein).
  • Some implementations of the techniques of the present disclosure involve dimensionality reduction of these high-dimensional vectors of metrics, so that the vectors may be visualized in two dimensions (2D) or three dimensions (3D), for interpretation by a clinician.
  • the set of metrics values corresponding to a mal setup which reflects the starting arrangement of the teeth may be plotted in this space, and each successive intermediate setup may also be plotted in this space. Finally, the final setup (which reflects the target or end-point of treatment) may be plotted in this space.
  • the vector of metrics values for a setup may comprise 470 individual scalar values. Plotting a point relative to 470 orthogonal coordinate axes may produce a result which is perfectly understandable to a computer, but a human may struggle to understand such a plot (because the plot would likely have to be displayed to the human in incremental steps, including 2 or 3 axes at a time).
  • Dimensionality reduction techniques in accordance with this disclosure may include one or more of T-distributed Stochastic Neighborhood Embedding (t-SNE), Uniform Manifold Approximation and Projection (U-Map), CompressionVAE (CVAE), principal component analysis (PCA), multidimensional scaling (MDS), Sammon mapping, and/or graph-based techniques.
  • t-SNE, U-Map and other neighborhood-embedding techniques may transform each data point from a high-dimensional dataset into a lower-dimensional space (e.g., 2 or 3 dimensions) while preserving the local neighborhood structure of the data.
  • the techniques of this disclosure may provide data precision-related technical improvements in the form of an improved aligner production process, by enabling quick and intuitive validation of the intermediate staging progression.
  • This approach may also provide a summary of the progress towards an automated system which may be used for monitoring or interacting with the setup design across one or multiple cases.
  • As shown in FIG. 4, after dimensionality reduction is applied, a 2D visualization of metrics for cases in a dataset can be formed.
  • FIG.5 shows an example of progress of an orthodontic case through the stages of treatment.
  • Standard machine learning data preprocessing may be performed on the raw metrics data prior to their ingestion by the U-Map or t-SNE algorithms. This includes, but is not limited to, data normalization.
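
A minimal sketch of this normalize-reduce-plot pipeline follows, assuming scikit-learn and matplotlib; the random 470-dimensional metric vectors are stand-ins for real per-setup metrics.

```python
# A sketch of normalization, dimensionality reduction, and visualization.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.manifold import TSNE

metrics = np.random.rand(200, 470)  # one metrics vector per setup/stage

scaled = StandardScaler().fit_transform(metrics)  # zero mean, unit variance
embedded = TSNE(n_components=2, perplexity=30).fit_transform(scaled)

plt.scatter(embedded[:, 0], embedded[:, 1], s=10)
plt.title("Oral care metrics after dimensionality reduction")
plt.show()
```
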
  • the plots may be used for one or more purposes, some of which are described below.
  • Patient treatment may be aided by the dimensionality reduction techniques outlined above through the use of automated systems that score each treatment plan in a plurality of treatment plans, thus serving in an automated quality-control capacity.
  • the present technique may provide a framework by which the characteristics of the arch can be represented quantitatively rather than qualitatively, allowing for automated systems such as AI and Machine Learning based systems to learn facets of what constitutes good arches and setups.
  • oral care metrics include Orthodontic Metrics (OM) and Restoration Design Metrics (RDM).
  • OM Orthodontic Metrics
  • RDM Restoration Design Metrics
  • OM Orthodontic Metrics
  • RDM Restoration Design Metrics
  • One use case example is in the creation of one or more dental restoration appliances.
  • Another use case example is in the creation of one or more veneers (such as a zirconia veneer).
  • Some RDM may quantify the shape and/or other characteristics of a tooth.
  • one or more neural networks or other machine learning models may be trained to identify or extract one or more RDM from one or more 3D representations of teeth (or gums, hardware and/or other elements of the patient's dentition).
  • Techniques of this disclosure may use RDM in various ways. For instance, in some implementations, one or more neural networks or other machine learning models may be trained to classify or label one or more setups, arches, dentitions or other sets of teeth based at least in part on RDM. As such, in these examples, RDMs form a part of the training data used for training these models.
  • An autoencoder for restoration design generation is disclosed in US Provisional Application No. US63/366514.
  • This autoencoder (e.g., a variational autoencoder or VAE) takes as input a tooth mesh (or other 3D representation) that reflects a mal state (i.e., the pre-restoration tooth shape).
  • the encoder component of the autoencoder encodes that tooth mesh to a latent form (e.g., a latent vector). Modifications may be applied to this latent vector (e.g., based on a mapping of the latent space through prior experiments), for the purpose of altering the geometry and/or structure of the eventual reconstructed mesh.
  • Case Assignment Such clusters may be used to gain further insight into the kinds of patient cases which exist in a dataset. Analysis of such clusters may reveal that patient treatment cases with certain RDM values (or ranges of values) may take less time to treat (or alternatively more time to treat). Cases which take more time to treat (or are otherwise more difficult) may be assigned to experienced or senior technicians for processing. Cases which take less time to treat may be assigned to newer or less-experienced technicians for processing. Such an assignment may be further aided by finding correlations between RDM values for certain cases and the known processing durations associated with those cases.
  • Bilateral Symmetry and/or Ratios A measure of the symmetry between one or more teeth and one or more other teeth on opposite sides of the dental arch. For example, for a pair of corresponding teeth, a measure of the width of each tooth. In one instance, one tooth is of normal width, and the other tooth is too narrow. In another instance, both teeth are of normal width.
  • Proportions of Adjacent Teeth Measure the width proportions of adjacent teeth as measured as a projection along an arch onto a plane (e.g., a plane that is situated in front of the patient's face).
  • the ideal proportions for use in the final restoration design can be, for example, the so-called golden proportions.
  • the golden proportions relate adjacent teeth, such as central incisors and lateral incisors. This metric pertains to the measuring of these proportions as the proportions exist in the pre-restoration mal dentition.
  • Bolton analysis measurements may be made by measuring upper widths, lower widths, and proportions between those quantities. Arch discrepancies may be described in absolute measurements (e.g., in mm or other suitable units) or in terms of proportions or ratios, in various implementations.
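
As an illustration, an overall Bolton-style ratio can be computed from per-tooth mesiodistal widths as follows; the width values and the commonly cited ideal of approximately 91.3% come from standard Bolton analysis rather than from this disclosure.

```python
# A sketch of an overall Bolton ratio from per-tooth mesiodistal widths (mm).
import numpy as np

def bolton_overall_ratio(lower_widths_mm, upper_widths_mm):
    """Ratio of summed mandibular to maxillary mesiodistal widths, in percent."""
    return 100.0 * np.sum(lower_widths_mm) / np.sum(upper_widths_mm)

ratio = bolton_overall_ratio(
    lower_widths_mm=[5.2, 5.9, 6.8, 7.1, 7.0, 11.2,
                     11.0, 7.1, 7.0, 6.9, 5.8, 5.3],
    upper_widths_mm=[8.5, 6.5, 7.9, 7.2, 6.9, 10.3,
                     10.1, 7.0, 7.1, 7.8, 6.6, 8.4])
discrepancy = ratio - 91.3  # deviation from the commonly cited ideal
```
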
  • Midline A measure of the midline of the maxillary incisors, relative to the midline of the mandibular incisors. Techniques of this disclosure may measure the midline of the maxillary incisors, relative to the midline of the nose (if data about nose location is available).
  • Proximal Contacts A measure of the size (area, volume, circumference, etc.) of the proximal contact between adjacent teeth.
  • the teeth touch along the mesial/distal surfaces and the gums fill in gingivally to where the teeth touch.
  • Black triangles may form if the gum tissue fails to fill the space below the proximal contact.
  • the size of the proximal contact may get progressively shorter for teeth located farther towards the posterior of the arch.
  • the proximal contact would be long enough so that there is an appropriately sized incisal embrasure and the gum tissue fills in the area below or gingival to the contact.
  • Embrasure In some implementations, techniques of this disclosure may measure the size (area, volume, circumference, etc.) of an embrasure, the gap between teeth at either the gingival or incisal edge. In some implementations, techniques of this disclosure may measure the symmetry between embrasures on opposite sides of the arch. An embrasure is based at least in part on the length of the contact between teeth, and/or at least in part on the shape of the tooth. In some instances, the size of the embrasure may get progressively longer for teeth located farther towards the posterior of the arch. [00168] Non-limiting examples of Intra-tooth RDM are enumerated below, continuing with the numbering of other RDM listed above.
  • Length and/or Width A measure of the length of a tooth relative to the width of that tooth. This metric may reveal, for example, that a patient has long central incisors. Width and length are defined as: a) width - mesial to distal distance; b) length - gingival to incisal distance; c) other dimensions of tooth body - the portions of tooth between the gingival region and the incisal edge. In some implementations, either or both of a length and a width may be measured for a tooth and compared to the length and/or width of one or more other teeth.
  • Tooth Morphology A measure of the primary anatomy of the tooth shape, such as line angles, buccal contours, and/or incisal angles and/or embrasures.
  • the frequency and/or dimensions may be measured.
  • the observed primary tooth shape aspects may be matched to one or more known styles.
  • Techniques of this disclosure may measure secondary anatomy of the tooth shape, such as mamelon grooves. For instance, the frequency and/or dimensions may be measured.
  • the observed secondary tooth shape aspects may be matched to one or more known styles.
  • techniques of this disclosure may measure tertiary anatomy of the tooth shape, such as perikymata or striations. For instance, the frequency and/or dimensions may be measured.
  • the observed tertiary tooth shape aspects may be matched to one or more known styles.
  • Shade and/or Translucency A measure of tooth shade and/or translucency. Tooth shade is often described by the Vita Classical or 3D Master shade guide. Tooth translucency is described by transmittance or a contrast ratio. Tooth shade and translucency may be evaluated (or measured) based on one or more of the following kinds of data pertaining to teeth: the incisal edge, incisal third, body and gingival third. The translucency of the enamel layer is generally higher than that of the dentin or cementum layers. Shade and translucency may, in some implementations, be measured on a per-voxel (local) basis.
  • Shade and translucency may, in some implementations, be measured on a per-area basis, such as an incisal area, tooth body area, etc. Tooth body may pertain to the portions of the tooth between the gingival region and the incisal edge.
  • Height of Contour A measure of the contour of a tooth. When viewed from the proximal view, all teeth have a specific contour or shape, moving from the gingival aspect to the incisal. This is referred to as the facial contour of the tooth. In each tooth, there is a height of contour, where that shape is the most pronounced. This height of contour changes from the teeth in the anterior of the arch to the teeth in the posterior of the arch.
  • this measurement may take the form of fitting against a template of known dimensions and/or known proportions. In some implementations, this measurement may quantify a degree of curvature along the facial tooth surface. In some implementations, this measurement may locate the point along the contour of the tooth where the curvature is most pronounced. This location may be measured as a distance away from the gingival margin or a distance away from the incisal edge, or a percentage along the length of the tooth.
  • RDMs may be converted to restoration design scores (RDS) that represent the RDMs’ agreement with or deviation from ideal values in a patient case dataset.
  • the network may learn baseline values for each of the RDMs from a ground truth dataset of post-restoration designs.
  • the ideal post-restoration arches may be further refined or possibly tailored to specific practices or oral care providers (e.g., a separate set of ground truth restoration designs for each dentist). All features are computed for each restoration design in the set of ground truth restoration designs. For each RDM, the median, kth percentile and (100 − k)th percentile are computed.
  • An RDS value between −1 and 1 indicates that the RDM value is within the 25th to 75th percentile values of the RDM in the ground truth dataset, while an RDS greater than 1 or less than −1 indicates that the RDM is outside of the normal range in the ground truth dataset.
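
One plausible scoring formula consistent with this description (the disclosure does not fix the exact expression) is sketched below, assuming NumPy; k = 25 recovers the 25th to 75th percentile behavior.

```python
# A sketch mapping an RDM value to a restoration design score (RDS).
import numpy as np

def rdm_to_rds(value, ground_truth_values, k=25):
    """RDS in (-1, 1) when `value` lies between the kth and (100-k)th
    percentiles of the ground truth RDM distribution; outside otherwise."""
    lo, med, hi = np.percentile(ground_truth_values, [k, 50, 100 - k])
    span = (hi - med) if value >= med else (med - lo)
    return (value - med) / span
```
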
  • the RDMs and/or RDSs described above may be used to train a machine learning (ML) classifier to rate restoration designs.
  • the classifier learns baseline values for each RDM and/or RDS from a dataset of case data, which includes pre-restoration and post-restoration dentition for each patient. This is followed by an optional normalization step which makes the features have zero mean and unit variance.
  • ML classifiers are subsequently trained using cross validation to identify post-restoration restoration designs.
  • Classifiers may include, for example, a Support Vector Machine (SVM), an elliptical covariance estimator, a Principal Components Analysis (PCA) reconstruction error-based classifier, decision trees, random forests, an AdaBoost classifier, Naïve Bayes, or neural networks (such as those disclosed elsewhere in this disclosure). Other classifiers are also possible, such as classifiers disclosed elsewhere in this disclosure.
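
A minimal sketch of training such a classifier with cross-validation follows, assuming scikit-learn; the feature and label arrays are stand-ins.

```python
# A sketch of training a rating classifier on RDM/RDS feature vectors.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.random.rand(300, 40)       # RDM/RDS vectors per restoration design
y = np.random.randint(0, 2, 300)  # 0 = pre-restoration, 1 = post-restoration

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")
```
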
  • SVM Support Vector Machine
  • PCA Principal Components Analysis
  • RDSs and/or ML classifiers can be used for several tasks during automated restoration design generation in accordance with the techniques of this disclosure, and are presented in enumerated format below: [00178] 1.
  • Initialization “Target” restoration designs represent restoration designs that fix some issues with the initial malocclusion but still need to be optimized to result in an adequate restoration design. These restoration designs may allow for speed-up of the restoration design search process by allowing for difficult tooth restoration geometries to be generated upfront. Target restoration designs may be generated by performing simple operations (e.g., filling-in template shapes over “stub” teeth). Alternatively, the classifier may learn restoration designs based on data from existing restoration designs.
  • the classifier may select the best restoration design(s) by choosing the restoration design(s) that minimize one or more RDSs, or by choosing the restoration design(s) that a ML model rates as being most representative of a post-restoration restoration design.
  • Restoration designs may be automatically created by iteratively adjusting the geometries and/or structures of one or more teeth in a restoration arch.
  • the loss function can be defined as a single RDM or RDS, a linear or non-linear combination of RDM/RDS, or the continuous output of a ML classifier.
  • Output from a ML classifier of this disclosure may include, for example, the distance to a hyperplane in a SVM, the distance metric in a GMM, or a probability that the state is a member of the pre-restoration or post-restoration class in a two-class classifier.
  • the RDSs described in Section B indicate the deviation of RDMs from an ideal restoration design. An RDS between −1 and 1 indicates that the RDM lies within the range of ideal data, while values outside this range suggest that the RDM lies outside of ideal values and should be further improved. Thus, RDSs can be used to select RDMs in a restoration design that need to be further optimized.
  • RDSs can also be used to identify RDMs that are currently in the acceptable range and that should not be increased during optimization.
  • Stopping Criteria for Optimization RDMs, RDSs, or the output of a ML classifier can be used to evaluate the acceptability of a restoration design. If the restoration design lies within an acceptable range, iterative optimization can be terminated.
  • Restoration Design generation can be designed to produce multiple candidate restoration designs, from which a subset can be selected. The subset may include the single best scoring restoration design or multiple restoration designs that achieve the best RDSs or RDSs above a threshold.
  • RDSs can be used to identify a subset of restoration designs that represent certain tradeoffs in restoration design.
  • RDMs, RDSs, and/or ML classifiers can be used in interactive tools to assist clinicians during restoration design generation and evaluation.
  • RDM could be computed and displayed to an oral care provider during interactive restoration design generation. The oral care provider could use clinical expertise to determine which RDM may benefit from further improvement and could perform the appropriate tooth geometry and/or structure modifications to achieve these improvements. Such a tool could also be used to train new oral care providers in how to develop an acceptable restoration design.
  • scores for individual RDMs and/or ML classifier output could be displayed. This would provide information about the severity of issues with restoration design generation.
  • the ML classifier would alert the oral care provider if the restoration design did not closely resemble an ideal post-restoration restoration design.
  • RDSs would indicate which RDMs needed to be further refined to achieve a restoration design which is suitable for use in creating a dental restoration appliance.
  • Interface for Restoration Design Evaluation RDMs, RDSs, and/or ML classifier output could be provided to a clinician and patient in a display interface alongside the restoration design. By providing this type of easily interpretable information, the systems of this disclosure may help the patient understand the target restoration design and promote treatment acceptance.
  • RDMs, RDSs, and/or ML classifier output could be used to demonstrate trade-offs between multiple candidate restoration designs.
  • systems of this disclosure may modify the shape and/or structure of a 3D representation of a tooth (e.g., a 3D mesh of a tooth), based at least in part on the dental restoration metrics and scores described herein.
  • One or more mesh elements of a tooth representation may be modified, including a 3D point (in the case of a 3D point cloud), a vertex, a face, an edge or a voxel (in the case of sparse processing).
  • a mesh element may be removed.
  • a mesh element may be added.
  • a mesh element may undergo modification to that mesh element’s position, to the mesh element’s orientation or to both.
  • one or more mesh elements may undergo smoothing in the course of forming the target restoration design.
  • One or more mesh elements may undergo deformations which are computed, at least in part, in consideration of one or more RDM.
  • one or more RDM may be provided to a machine learning model (e.g., an encoder, a neural network - such as an autoencoder, a U-Net, a transformer or a network comprising convolution and/or pooling layers) which has been trained to generate a 3D oral care representation, such as a restoration tooth design (e.g., to design a crown, a root, or both).
  • Such RDM may impart information about the shape and/or structure of the one or more teeth meshes (or point clouds, etc.) to a neural network which has been trained to generate representations of the teeth, thereby improving those representations.
  • An autoencoder (e.g., a variational autoencoder or a capsule autoencoder) may be trained to encode a 3D oral care representation into a latent vector.
  • the autoencoder may also be trained to reconstruct the 3D oral care representation out of that latent vector.
  • the reconstructed 3D oral care representation may be compared to the inputted 3D oral care representation using a reconstruction error calculation.
  • FIG.11 shows an example technique, using systems of this disclosure, to generate a tooth restoration design using RDM.
  • a pre-restoration tooth design may be received at step 1102 (e.g., a 3D representation).
  • the pre-restoration tooth design may be provided to a module at step 1104 which may compute one or more RDM (e.g., “Length and/or Width”, “Height of Contour”, or “Tooth Morphology”, among others) on that pre-restoration tooth.
  • a scoring function may be executed on the RDM, according to the descriptions herein.
  • a termination criterion is evaluated at step 1108. Examples of criteria include a maximum number of iterations. Other examples of termination criteria include evaluating the RDM and/or score to determine whether the RDM and/or score are within thresholds or tolerances (e.g., whether the tooth has become sufficiently wide or sufficiently long). If the termination criterion is not yet met at step 1108, then at step 1114 the shape and/or structure of the tooth is modified. After modifying the tooth in step 1114, one or more RDM are updated in step 1112. The technique then iterates as illustrated and described. After the termination criterion is met, the completed restoration design is outputted at step 1110.
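
A minimal sketch of this loop follows; the helper callables (compute_rdm, score_rdm, modify_tooth) are hypothetical stand-ins for the modules described above, supplied by the caller.

```python
# A sketch of the FIG. 11 iterative restoration-design loop.
def generate_restoration_design(tooth, compute_rdm, score_rdm, modify_tooth,
                                max_iters=100, score_tolerance=1.0):
    """compute_rdm / score_rdm / modify_tooth are caller-supplied stand-ins
    for the RDM, scoring, and modification modules (steps 1104/1106/1114)."""
    rdm = compute_rdm(tooth)               # step 1104: compute RDM
    score = score_rdm(rdm)                 # step 1106: scoring function
    for _ in range(max_iters):             # step 1108: termination criterion
        if abs(score) <= score_tolerance:  # e.g., RDM/score within tolerance
            break
        tooth = modify_tooth(tooth, rdm, score)  # step 1114: modify shape
        rdm = compute_rdm(tooth)                 # step 1112: update RDM
        score = score_rdm(rdm)
    return tooth                           # step 1110: output the design
```
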
  • Techniques of this disclosure may train an encoder-decoder structure to reconstruct a 3D oral care representation which is suitable for oral care appliance generation.
  • An encoder-decoder structure may comprise at least one encoder or at least one decoder.
  • Non-limiting examples of an encoder-decoder structure include a 3D U-Net, a transformer, a pyramid encoder-decoder or an autoencoder, among others.
  • Non-limiting examples of autoencoders include a variational autoencoder, a regularized autoencoder, a masked autoencoder or a capsule autoencoder.
  • Encoder-decoder structures may be trained to generate 3D oral care representations (e.g., tooth restoration designs, appliance components, and other examples of 3D oral care representations described herein).
  • Such 3D oral care representations may comprise point clouds, polylines, meshes, voxels and the like.
  • Such 3D oral care representations may be generated according to the requirements of the oral care arguments which may, in some implementations, be supplied to the generative model.
  • Oral care arguments may include oral care parameters as disclosed herein, or other real-valued, text-based or categorical inputs which specify intended aspects of the one or more 3D oral care representations which are to be generated.
  • oral care arguments may include oral care metrics, which may describe intended aspects of the one or more 3D oral care representations which are to be generated. Oral care arguments are specifically adapted to the implementations described herein. For example, the oral care arguments may specify the intended designs (e.g., including shape and/or structure) of 3D oral care representations which may be generated (or modified) according to techniques described herein. In short, implementations using the specific oral care arguments disclosed herein generate more accurate 3D oral care representations than implementations that do not use the specific oral care arguments.
  • a text encoder may encode a set of natural language instructions from the clinician (e.g., generate a text embedding). A text string may comprise tokens.
  • An encoder for generating text embeddings may, in some implementations, apply either mean-pooling or max-pooling between the token vectors.
  • a transformer (e.g., BERT or Siamese BERT) may be trained to generate such text embeddings.
  • such a model for generating text embeddings may be trained using transfer learning (e.g., initially trained on another corpus of text, and then receive further training on text related to digital oral care).
  • Some text embeddings may encode text at the word level.
  • Some text embeddings may encode text at the token level.
  • a transformer for generating a text embedding may, in some implementations, be trained, at least in part, with a loss calculation which compares predicted outputs to ground truth outputs (e.g., softmax loss, multiple negatives ranking loss, MSE margin loss, cross-entropy loss or the like).
  • the non-text arguments such as real values or categorical values, may be converted to text, and subsequently embedded using the techniques described herein.
  • For example, a natural language instruction may specify: “the crown shape should take into consideration the shape of adjacent teeth and should have no more than x mm (e.g., 0.1 mm) space between the adjacent teeth.”
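
A sketch of generating a text embedding for such an instruction follows, assuming the Hugging Face transformers library; the model name and the mean-pooling choice are illustrative.

```python
# A sketch of text embedding via mean-pooling over token vectors.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

text = "Generate a crown with no more than 0.1 mm space to the adjacent teeth."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    token_vectors = model(**inputs).last_hidden_state  # (1, tokens, hidden)

# Mean-pool the token vectors (masking padding) to get one embedding per text.
mask = inputs["attention_mask"].unsqueeze(-1)
embedding = (token_vectors * mask).sum(1) / mask.sum(1)
```
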
  • Techniques of this disclosure may, in some implementations, use PointNet, PointNet++, or derivative neural networks (e.g., networks trained via transfer learning using either PointNet or PointNet++ as a basis for training) to extract local or global neural network features from a 3D point cloud or other 3D representation (e.g., a 3D point cloud describing aspects of the patient’s dentition – such as teeth or gums).
  • Techniques of this disclosure may, in some implementations, use U-Nets to extract local or global neural network features from a 3D point cloud or other 3D representation.
  • input data may comprise 3D mesh data, 3D point cloud data, 3D surface data, 3D polyline data, 3D voxel data, or data pertaining to a spline (e.g., control points).
  • An encoder- decoder structure may comprise one or more encoders, or one or more decoders.
  • the encoder may take as input mesh element feature vectors for one or more of the inputted mesh elements, to improve the ability of the encoder to generate a representation of the input data.
  • Examples of encoder-decoder structures include U-Nets, autoencoders or transformers (among others).
  • a representation generation module may comprise one or more encoder-decoder structures (or portions of encoder-decoder structures – such as individual encoders or individual decoders).
  • a representation generation module may generate an information-rich (optionally reduced-dimensionality) representation of the input data, which may be more easily consumed by other generative or discriminative machine learning models.
  • a U-Net may comprise an encoder, followed by a decoder.
  • the architecture of a U-Net may resemble a U shape.
  • the encoder may extract one or more global neural network features from the input 3D representation, zero or more intermediate-level neural network features, or one or more local neural network features (at the most local level as contrasted with the most global level).
  • the output from each level of the encoder may be passed along to the input of corresponding levels of a decoder (e.g., by way of skip connections).
  • the decoder may operate on multiple levels of global-to-local neural network features. For instance, the decoder may output a representation of the input data which may contain global, intermediate or local information about the input data.
  • the U-Net may, in some implementations, generate an information-rich (optionally reduced-dimensionality) representation of the input data, which may be more easily consumed by other generative or discriminative machine learning models.
  • An autoencoder may be configured to encode the input data into a latent form.
  • An autoencoder may train an encoder to reformat the input data into a reduced-dimensionality latent form in between the encoder and the decoder, and then train a decoder to reconstruct the input data from that latent form of the data.
  • a reconstruction error may be computed to quantify the extent to which the reconstructed form of the data differs from the input data.
  • the latent form may, in some implementations, be used as an information-rich reduced-dimensionality representation of the input data which may be more easily consumed by other generative or discriminative machine learning models.
  • an autoencoder may be trained to input a 3D representation, encode that 3D representation into a latent form (e.g., a latent embedding), and then reconstruct a close facsimile of that input 3D representation at the output.
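
A minimal autoencoder sketch follows, assuming PyTorch; the flattened-point-cloud input and the layer sizes are illustrative.

```python
# A sketch of an autoencoder that encodes a 3D representation to a latent
# form and reconstructs a close facsimile of the input at its output.
import torch
import torch.nn as nn

class OralCareAutoencoder(nn.Module):
    def __init__(self, in_dim=3000, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(),
                                     nn.Linear(512, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                     nn.Linear(512, in_dim))

    def forward(self, x):
        latent = self.encoder(x)  # reduced-dimensionality latent form
        return self.decoder(latent), latent

model = OralCareAutoencoder()
x = torch.randn(8, 3000)          # e.g., 1000 points x 3 coordinates, flattened
reconstruction, latent = model(x)
reconstruction_error = nn.functional.mse_loss(reconstruction, x)
```
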
  • a transformer may be trained to use self-attention to generate, at least in part, representations of its input.
  • a transformer may encode long-range dependencies (e.g., encode relationships between a large number of inputs).
  • a transformer may comprise an encoder or a decoder.
  • Such an encoder may, in some implementations, operate in a bi-directional fashion or may operate a self-attention mechanism.
  • a decoder may, in some implementations, operate a masked self-attention mechanism, may operate a cross-attention mechanism, or may operate in an auto-regressive manner.
  • the self-attention operations of the transformers described herein may, in some implementations, relate different positions or aspects of an individual 3D oral care representation in order to compute a reduced-dimensionality representation of that 3D oral care representation.
  • the cross-attention operations of the transformers described herein may, in some implementations, mix or combine aspects of two (or more) different 3D oral care representations.
  • the auto-regressive operations of the transformers described herein may, in some implementations, consume previously generated aspects of 3D oral care representations (e.g., previously generated points, point clouds, transforms, etc.) as additional input when generating a new or modified 3D oral care representation.
  • the transformer may, in some implementations, generate a latent form of the input data, which may be used as an information-rich reduced-dimensionality representation of the input data, which may be more easily consumed by other generative or discriminative machine learning models.
  • an encoder-decoder structure may first be trained as an autoencoder. In deployment, one or more modifications may be made to the latent form of the input data.
  • This modified latent form may then proceed to be reconstructed by the decoder, yielding a reconstructed form of the input data which differs from the input data in one or more intended aspects.
  • Oral care arguments such as oral care parameters or oral care metrics may be supplied to the encoder, the decoder, or may be used in the modification of the latent form, to influence the encoder-decoder structure in generating a reconstructed form that has desired characteristics (e.g., characteristics which may differ from that of the input data).
  • Techniques of this disclosure may, in some instances, be trained using federated learning.
  • Federated learning may enable multiple remote clinicians to iteratively improve a machine learning model (e.g., validation of 3D oral care representations, mesh segmentation, mesh cleanup, other techniques which involve labeling mesh elements, coordinate system prediction, non-organic object placement on teeth, appliance component generation, tooth restoration design generation, techniques for placing 3D oral care representations, setups prediction, generation or modification of 3D oral care representations using autoencoders, generation or modification of 3D oral care representations using transformers, generation or modification of 3D oral care representations using diffusion models, 3D oral care representation classification, imputation of missing values), while protecting data privacy (e.g., the clinical data may not need to be sent “over the wire” to a third party). Data privacy is particularly important to clinical data, which is protected by applicable laws.
  • a clinician may receive a copy of a machine learning model, use a local machine learning program to further train that ML model using locally available data from the local clinic, and then send the updated ML model back to the central hub or third party.
  • the central hub or third party may integrate the updated ML models from multiple clinicians into a single updated ML model which benefits from the learnings of recently collected patient data at the various clinical sites. In this way, a new ML model may be trained which benefits from additional and updated patient data (possibly from multiple clinical sites), while those patient data are never actually sent to the 3rd party. Training on a local in-clinic device may, in some instances, be performed when the device is idle or otherwise be performed during off-hours (e.g., when patients are not being treated in the clinic).
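
A sketch of the central-hub aggregation step (in the style of federated averaging) follows, assuming PyTorch models returned from each clinic; weighting by local sample counts is an illustrative choice.

```python
# A sketch of merging locally trained model copies into one central model.
import copy
import torch

def federated_average(clinic_models, clinic_sample_counts):
    """Weighted average of model parameters across clinics (FedAvg-style)."""
    total = sum(clinic_sample_counts)
    merged = copy.deepcopy(clinic_models[0])
    avg_state = {k: torch.zeros_like(v, dtype=torch.float32)
                 for k, v in merged.state_dict().items()}
    for model, n in zip(clinic_models, clinic_sample_counts):
        for k, v in model.state_dict().items():
            avg_state[k] += v.float() * (n / total)
    merged.load_state_dict(avg_state)  # patient data never leaves the clinics
    return merged
```
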
  • Devices in the clinical environment for the collection of data and/or the training of ML models for techniques described here may include intra-oral scanners, CT scanners, X-ray machines, laptop computers, servers, desktop computers or handheld devices (such as smart phones with image collection capability).
  • contrastive learning may be used to train, at least in part, the ML models described herein. Contrastive learning may, in some instances, augment samples in a training dataset to accentuate the differences in samples from different classes and/or increase the similarity of samples of the same class.
  • Machine learning models such as: U-Nets, encoders, autoencoders, pyramid encoder- decoders, transformers, or convolution and/or pooling layers, may be trained as a part of a method for hardware (or appliance component) placement.
  • Representation learning may train a first module to determine an embedded representation of a 3D oral care representation (e.g., encoding a mesh or point cloud into a latent form using an autoencoder, or using a U-Net, encoder, transformer, block of convolution and/or pooling layers or the like). That representation may comprise a reduced dimensionality form and/or information-rich version of the inputted 3D oral care representation.
  • a representation may be aided by the calculation of a mesh element feature vector for one or more mesh elements (e.g., each mesh element).
  • a representation may be computed for a hardware element (or appliance component).
  • Such representations are suitable to be provided to a second module, which may perform a generative task, such as oral care metric generation.
  • When a U-Net (among other neural networks) is trained to generate the representations of tooth meshes, the mesh convolution and/or mesh pooling techniques described herein enjoy invariance to rotations, translations and scaling of that tooth mesh. Examples:
  • Example 1. A method for generating one or more target shapes for a tooth comprising: receiving, by processing circuitry of a computing device, a digital 3D model of the tooth in a pre-restoration state; executing, by the processing circuitry, a scoring function using one or more dental restoration metrics related to the tooth in the pre-restoration state as input to generate a score associated with the tooth in the pre-restoration state; modifying, by the processing circuitry, at least one of a structure or a shape of the tooth to form modified aspects of the tooth; and updating, by the processing circuitry, the at least one of the structure or the shape of the tooth based on the score and the modified aspects of the tooth to generate one or more post-restoration states of the tooth after implementation of a dental restoration treatment on the tooth.
  • Example 2 The method of Example 1, further comprising obtaining, by the processing circuitry, position information associated with the tooth from the 3D digital model.
  • Example 3 The method of Example 1, further comprising obtaining, by the processing circuitry, landmark information associated with the tooth from the 3D digital model.
  • Example 4. The method of Example 1, wherein the scoring function is represented by f(P(x)), where f is a function selected from one or more of linear, non-linear, and probabilistic framework functions.
  • Example 6. The method of Example 1, wherein modifying the at least one of the structure or the shape of the tooth comprises modifying, by the processing circuitry, at least one of the position or orientation of a mesh element within the digital 3D model of the tooth to generate the modified state of the tooth.
  • Example 7. The method of Example 6, wherein a mesh element comprises at least one of a point, a vertex, an edge, a face or a voxel.
  • Example 8. The method of Example 6, wherein a smoothing operation is applied to one or more mesh elements.
  • Example 9. The method of Example 6, wherein one or more mesh elements are removed from the tooth.
  • Example 10. The method of Example 6, wherein one or more mesh elements are added to the tooth.
  • Example 11. The method of Example 1, wherein the one or more post-restoration states of the tooth are used to generate a design for a dental restoration appliance.
  • Example 12. The method of Example 1, wherein the one or more post-restoration states of the tooth are used to generate a design for an orthodontic appliance.
  • Example 13. The method of Example 12, wherein the orthodontic appliance is a clear tray aligner (CTA).
  • Example 14. The method of Example 1, wherein the computing device is deployed at a clinical context, and wherein the method is performed at the clinical context.
  • Example 16. The method of Example 15, wherein the generated 3D oral care representation comprises one or more post-restoration states of the tooth.

Abstract

Systems and techniques are disclosed for visualizing oral care metrics. The method involves receiving one or more three-dimensional (3D) oral care representations and utilizing processing circuitry to compute the oral care metrics based on these representations. The processing circuitry further converts the computed oral care metrics into a multidimensional format. To enhance visualization, the dimensionality of the oral care metrics in the multidimensional format is reduced, forming a reduced-dimensionality version of the metrics. Finally, the processing circuitry renders the reduced-dimensionality version of the oral care metrics in a visualized form. These systems and techniques enable the effective visualization of oral care metrics, providing valuable insights and facilitating improved analysis and decision-making in oral care applications.

Description

METRICS CALCULATION AND VISUALIZATION IN DIGITAL ORAL CARE

Related Documents

[0001] The entire disclosure of PCT Application No. PCT/IB2022/057373 is incorporated herein by reference. The entire disclosure of each of the PCT Applications with Publication Nos. WO2022123402A1, WO2021245480A1, and WO2020026117A1 is incorporated herein by reference. The entire disclosure of each of the following Provisional U.S. Patent Applications is incorporated herein by reference: 63/432,627; 63/366,492; 63/366,495; 63/352,850; 63/366,490; 63/366,494; 63/370,160; 63/366,507; 63/352,877; 63/366,514; 63/366,498; and 63/264,914.

Technical Field

[0002] This disclosure relates to digital solutions that address the area of oral care.

Summary

[0003] The present disclosure describes systems and techniques for generating visualizations of one or more oral care metrics, which may incorporate dimensionality reduction. The rendered oral care metric(s) may provide the technical improvements of data precision enhancement and improved efficiency when used for generating designs for, or directly fabricating, one or more oral care appliances (or components thereof). This disclosure also describes metrics for use in generating restoration tooth designs for use in dental restoration treatment. Some metrics pertain to an individual tooth. Some metrics pertain to two or more teeth (such as symmetrical pairs of teeth or adjacent teeth). These metrics may enable the calculation of scores for teeth which are to undergo dental restoration. These metrics may enable the rating of arches which are to undergo dental restoration. These metrics may enable the operation of optimization algorithms for the generation of dental restoration designs (e.g., for the generation of dental restoration appliances). These metrics may be visualized using the visualization techniques described herein. Orthodontic metrics may also be visualized using the techniques described herein.

[0004] Methods of this disclosure may generate one or more target shapes for a 3D oral care representation (e.g., a tooth, an appliance component, a fixture model component, etc.). A digital 3D model of a patient’s tooth in a pre-restoration state (or another 3D oral care representation in a pre-modification state) may be provided to the method. A scoring function using one or more dental restoration metrics related to the tooth in the pre-restoration state (or another 3D oral care representation in a pre-modification state) may be provided as input to the method, to generate a score associated with the tooth in the pre-restoration state (or another 3D oral care representation in a pre-modification state). At least one of a structure or a shape of the tooth (or another 3D oral care representation) may be modified to form modified aspects of the tooth (or another 3D oral care representation). At least one of the structure or the shape of the tooth (or another 3D oral care representation) may be updated based on the score and/or the modified aspects of the tooth (or another 3D oral care representation) to generate one or more post-restoration states of the tooth (or another 3D oral care representation) after implementation of a dental restoration treatment on the tooth (or another 3D oral care representation). Position information associated with the tooth from the digital 3D model may be provided to the method. Landmark information associated with the tooth from the digital 3D model may be provided to the method.
An example scoring function may be defined by the formula score(X) = Σ_i w_i · P_i(x_i), where X represents a vector of metrics, P_i is a function that computes an error or penalty given a value x_i of a metric, and w_i is a weight associated with the penalty for the metric. A scoring function may be represented by f(P(x)), where f is a function selected from one or more of linear, non-linear, and probabilistic framework functions, among others. The methods may modify at least one of the structure or the shape of the tooth (or other 3D oral care representation), such as by modifying at least one of the position or orientation of a mesh element within the digital 3D model of the tooth (or other 3D oral care representation) to generate the modified state of the tooth (or other 3D oral care representation). A mesh element comprises at least one of a point, a vertex, an edge, a face or a voxel. The methods may apply a smoothing operation to one or more mesh elements. The methods may remove one or more mesh elements from the tooth (or other 3D oral care representation). The methods may add one or more mesh elements to the tooth (or other 3D oral care representation). The methods may generate one or more post-restoration states of the tooth (or other post-modification 3D oral care representation), which may be used to generate a design for a dental restoration appliance (or other oral care appliance).

[0005] Methods of this disclosure may visualize oral care metrics. One or more three-dimensional (3D) oral care representations (e.g., teeth of the patient, appliance components, fixture model components, etc.) may be provided to a method for computing and visualizing oral care metrics. The methods may compute one or more oral care metrics based on the one or more 3D oral care representations (e.g., teeth, among others described herein). The one or more oral care metrics (e.g., dozens, hundreds or thousands of metrics) may be saved to a vector, which may then undergo dimensionality reduction (e.g., using U-Map or t-SNE), to form a reduced-dimensionality version of the one or more oral care metrics. The reduced-dimensionality version of the one or more oral care metrics may be visualized (e.g., in a low-dimensional plot, such as in 2D or 3D). The one or more oral care metrics may be rendered in the visualized form and provided to a system configured to construct an oral care appliance, a fixture model, or a component of the oral care appliance using the at least one oral care metric rendered in the visualized form. The oral care appliance may comprise at least one of a dental restoration appliance or an orthodontic appliance. The one or more oral care metrics may quantify a geometrical relationship between at least two teeth in a respective 3D oral care representation of the one or more 3D oral care representations. The one or more oral care metrics may quantify at least one of the structure or a shape of an individual tooth in a respective 3D oral care representation of the one or more 3D oral care representations. The reduced-dimensionality metrics may be rendered, for example, on a graphical user interface (GUI) element that indicates a progression from a maloccluded setup to a final setup associated with an orthodontic treatment plan for a patient. The method may be used to validate a setup series that includes the maloccluded setup, the final setup, or one or more intermediate stages associated with the orthodontic treatment plan for the patient. An orthodontic treatment plan for the patient may involve a clear tray aligner (CTA). The method may be executed in a clinical context.
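To make the scoring function concrete, the following is a minimal sketch of one possible implementation of score(X) = Σ_i w_i · P_i(x_i); the metric names, penalty targets, and weights are hypothetical placeholders for illustration, not values prescribed by this disclosure.

```python
# Minimal sketch of the scoring function score(X) = sum_i w_i * P_i(x_i).
# Metric names, penalty targets, and weights are hypothetical placeholders.
from typing import Callable, Dict

def squared_deviation(target: float) -> Callable[[float], float]:
    """Penalty P_i: squared deviation of a metric value from a target."""
    return lambda x: (x - target) ** 2

# Hypothetical restoration design metrics, each with a penalty function P_i.
PENALTIES: Dict[str, Callable[[float], float]] = {
    "incisal_edge_length_mm": squared_deviation(8.5),
    "mesiodistal_width_mm": squared_deviation(7.0),
    "symmetry_ratio": squared_deviation(1.0),
}
# Weights w_i associated with the penalty for each metric.
WEIGHTS: Dict[str, float] = {
    "incisal_edge_length_mm": 1.0,
    "mesiodistal_width_mm": 0.5,
    "symmetry_ratio": 2.0,
}

def score(metrics: Dict[str, float]) -> float:
    """Weighted sum of per-metric penalties over the metrics vector X."""
    return sum(WEIGHTS[name] * PENALTIES[name](value)
               for name, value in metrics.items())

# Score a hypothetical pre-restoration tooth; the modify/update steps of the
# method would then re-score candidate shapes and keep improvements.
pre_restoration = {"incisal_edge_length_mm": 7.2,
                   "mesiodistal_width_mm": 6.1,
                   "symmetry_ratio": 0.8}
print(f"pre-restoration score: {score(pre_restoration):.3f}")
```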
[0006] Methods of this disclosure may be used for visualizing oral care metrics. The methods may receive three-dimensional representations of oral care data (e.g., teeth, dental arches, etc.) and compute oral care metrics based on these representations. The oral care metrics may be stored into a high-dimensional vector (e.g., containing hundreds, thousands or tens of thousands of dimensions), and the dimensionality of the high-dimensional vector can be reduced to generate a reduced-dimensionality version or tuple. The tuple of the oral care metrics can be rendered in a visualized form (e.g., plotted as points in X and Y axes).

[0007] In one aspect, the methods include transmitting the visualized oral care metrics to a system that constructs a fixture model, a fixture model component, an oral care appliance, or a component of an oral care appliance using the metrics. The oral care appliance can be a dental restoration appliance, an indirect bonding tray, or an orthodontic appliance, among other examples.

[0008] The oral care metrics may quantify various aspects of 3D representations of oral care data (e.g., teeth, dental arches, etc.), such as geometrical relationships between two or more teeth, the layouts of teeth, or the structure and/or shape of individual teeth. The methods may also involve rendering one or more graphical user interface elements (see FIG.4) that indicate the progression from a maloccluded setup to a final setup associated with an orthodontic treatment plan. The fitness of one or more setups can be validated. Orthodontic treatment may involve the use of a clear tray aligner (CTA) or an indirect bonding tray, among other treatments.

[0009] The methods may utilize dimensionality reduction techniques such as Stochastic Neighbor Embedding (SNE), t-Distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection for Dimension Reduction (U-Map), Locally Linear Embedding (LLE), Symmetric SNE, Isomap, or Sammon Mapping, among others.

[0010] In another aspect, the methods may involve clustering the reduced-dimensionality versions of oral care metrics using unsupervised clustering techniques. Histograms can be generated based on the unsupervised clustering (e.g., based on the clusters).

[0011] A computing device can visualize the oral care metrics. The computing device may include means for receiving oral care metrics, means for converting the metrics and reducing their dimensionality, and means for rendering the oral care metrics in a visualized form. The device can transmit the visualized oral care metrics to a system for constructing an oral care appliance. The oral care metrics can quantify geometrical relationships between teeth or the structure and shape of individual teeth. The device can also render a graphical user interface element indicating the progression of an orthodontic treatment plan.

[0012] The methods can be deployed at a clinical context to assist in visualizing and analyzing oral care metrics.

Brief Description of Drawings

[0013] FIG.1 shows a method of augmenting training data for use in training machine learning (ML) models of this disclosure.
[0005] FIG.2 shows the improvement in an Archform Alignment oral care metric over the stages of treatment.
[0006] FIG.3 shows a plot of orthodontic metrics (for UR1) which have undergone dimensionality reduction.
[0007] FIG.4 shows a plot of orthodontic metrics (for the full arch) which have undergone dimensionality reduction.
[0008] FIG.5 shows the progress in orthodontic treatment of a patient case.
[0009] FIG.6 shows a tooth-arch alignment oral care metric.
[0010] FIG.7 shows the progress scores for the orthodontic treatment of a patient case.
[0011] FIG.8 shows a buccolingual inclination oral care metric.
[0012] FIG.9 shows a midline oral care metric.
[0013] FIG.10 shows an overjet oral care metric.
[0014] FIG.11 shows a method of computing restoration design metrics based upon 3D representations of the patient’s dentition.
[0015] FIG.12 shows a method of using one or more trained ML models to generate oral care metrics based upon 3D representations of the patient’s dentition.

Detailed Description

[0014] The machine learning techniques described herein may receive a variety of input data, including tooth meshes for one or both dental arches of the patient. The tooth data may be presented in the form of 3D representations, such as meshes or point clouds. These data may be preprocessed, for example, by arranging the constituent mesh elements into lists and computing an optional mesh element feature vector for each mesh element. Such vectors may impart valuable information about the shape and/or structure of an oral care mesh to the machine learning models described herein. Additional inputs may be received as input to the machine learning models described herein, such as one or more oral care metrics. Oral care metrics may be used for measuring one or more physical aspects of an oral care mesh (e.g., physical relationships within a tooth or between different teeth). In some instances, an oral care metric may be computed for either or both of a malocclusion oral care mesh example and a ground truth oral care mesh example, which is then used in the training of the machine learning models described herein. The metric value may be received as input to the machine learning models described herein, as a way of training that model or those models to encode a distribution of such a metric over the several examples of the training dataset. During training, the network may then receive metric value(s) as input, to assist in training the network to link that inputted metric value to the physical aspects of the ground truth oral care mesh which is used in loss calculation. Such a loss calculation may quantify the difference between a prediction and a ground truth example (e.g., between a predicted oral care mesh and a ground truth oral care mesh). By providing the network with metric value(s), the techniques of this disclosure may, through the course of loss calculation and subsequent backpropagation, train the network to encode a distribution of a given metric. In deployment, one or more oral care parameters (procedure parameters or restoration design parameters) may be defined to specify one or more aspects of an intended oral care mesh, which is to be generated using the machine learning models described herein, which have been trained for that purpose. In some implementations, an oral care parameter may be defined which corresponds to an oral care metric, and which may be received as input to the machine learning models described herein and taken as an instruction to generate an oral care mesh with the specified customization. This interplay between oral care metrics and oral care parameters may also apply to the training and deployment of other predictive models in oral care as well.
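As one illustration of the preprocessing described in the preceding paragraph, the sketch below computes a simple per-vertex mesh element feature vector for a triangle mesh. The particular features (position, area-weighted vertex normal, distance to the mesh centroid) are assumptions chosen for illustration; this disclosure does not prescribe a specific feature set.

```python
# Sketch: per-vertex mesh element feature vectors for a triangle mesh.
# Features chosen here (xyz, unit vertex normal, centroid distance) are
# illustrative assumptions, not a prescribed feature set.
import numpy as np

def vertex_feature_vectors(vertices: np.ndarray, faces: np.ndarray) -> np.ndarray:
    """vertices: (V, 3) floats; faces: (F, 3) vertex indices.
    Returns a (V, 7) array: position, unit normal, distance to centroid."""
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))
    face_normals = np.cross(v1 - v0, v2 - v0)  # length encodes 2x face area
    vert_normals = np.zeros_like(vertices)
    for i in range(3):  # accumulate area-weighted face normals onto vertices
        np.add.at(vert_normals, faces[:, i], face_normals)
    norms = np.linalg.norm(vert_normals, axis=1, keepdims=True)
    vert_normals = vert_normals / np.clip(norms, 1e-12, None)
    centroid = vertices.mean(axis=0)
    dist = np.linalg.norm(vertices - centroid, axis=1, keepdims=True)
    return np.hstack([vertices, vert_normals, dist])

# Example: a single-triangle "mesh".
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
tris = np.array([[0, 1, 2]])
print(vertex_feature_vectors(verts, tris).shape)  # (3, 7)
```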
[0015] The predictive models of the present disclosure may, in some implementations, produce more accurate results by the incorporation of one or more of the following inputs: archform information V, interproximal reduction (IPR) information U, tooth dimension information P, tooth gap information Q, latent capsule representations of oral care meshes T, latent vector representations of oral care meshes A, procedure parameters K (which may describe a clinician’s intended treatment of the patient), doctor preferences L (which may describe the typical procedure parameters chosen by a doctor), flags regarding tooth status M (such as for fixed or pinned teeth), tooth position information N, tooth orientation information O, tooth name/dental notation R, and oral care metrics S (comprising at least one of orthodontic metrics and restoration design metrics).

[0016] Systems of this disclosure may, in some instances, be deployed at a clinical context (such as a dental or orthodontic office) for use by clinicians (e.g., doctors, dentists, orthodontists, nurses, hygienists, oral care technicians). Such systems which are deployed at a clinical context may enable clinicians to process oral care data (such as dental scans) in the clinic environment, or in some instances, in a "chairside" context (where the patient is present in the clinical environment). A non-limiting list of example techniques includes: segmentation, mesh cleanup, coordinate system prediction, CTA trimline generation, restoration design generation, appliance component generation or placement or assembly, generation of other oral care meshes, the validation of oral care meshes, setups prediction, removal of hardware from tooth meshes, hardware placement on teeth, imputation of missing values, clustering on oral care data, oral care mesh classification, setups comparison, metrics calculation, or metrics visualization. The execution of these techniques may, in some instances, enable patient data to be processed, analyzed, and used in appliance creation by the clinician before the patient leaves the clinical environment (which may facilitate treatment planning because feedback may be received from the patient during the treatment planning process).

[0017] Systems of this disclosure may automate operations in digital orthodontics (e.g., setups prediction, hardware placement, setups comparison), in digital dentistry (e.g., restoration design generation) or in combinations thereof. Some techniques may apply to either or both of digital orthodontics and digital dentistry. A non-limiting list of examples is as follows: segmentation, mesh cleanup, coordinate system prediction, oral care mesh validation, imputation of oral care parameters, oral care mesh generation or modification (e.g., using autoencoders, transformers, continuous normalizing flows or denoising diffusion models), metrics visualization, appliance component placement or appliance component generation or the like. In some instances, systems of this disclosure may enable a clinician or technician to process oral care data (such as scanned dental arches). In addition to segmentation, mesh cleanup, coordinate system prediction or validation operations, the systems of this disclosure may enable orthodontic treatment planning, which may involve setups prediction as at least one operation. Systems of this disclosure may also enable restoration design generation, where one or more restored tooth designs are generated and processed in the course of creating oral care appliances.
Systems of this disclosure may enable either or both of orthodontic or dental treatment planning, or may enable automation steps in the generation of either or both of orthodontic or dental appliances. Some appliances may enable both dental and orthodontic treatment, while other appliances may enable one or the other.

[0018] Aspects of the present disclosure can provide a technical solution to the technical problem of computing oral care metrics (e.g., orthodontic metrics or restoration design metrics) using either mesh processing techniques or machine learning techniques. In particular, by practicing techniques disclosed herein, computing systems specifically adapted to compute oral care metrics for use in oral care appliance generation (e.g., oral care metrics which may be provided to a setups prediction model or a restoration design generation model) are improved. For example, aspects of the present disclosure improve the performance of a computing system for computing oral care metrics by reducing the consumption of computing resources. In particular, aspects of the present disclosure reduce computing resource consumption by decimating 3D representations of the patient’s dentition (e.g., reducing the counts of mesh elements used to describe aspects of the patient’s dentition) so that computing resources are not unnecessarily wasted by processing excess quantities of mesh elements. Additionally, decimating the meshes does not reduce the overall predictive accuracy of the computing system (and indeed may actually improve predictions, because the input provided to the ML model after decimation is a more accurate (or better) representation of the patient’s dentition). For example, noise or other artifacts which are unimportant (and which may reduce the accuracy of the predictive models) are removed. That is, aspects of the present disclosure provide for more efficient allocation of computing resources, and in a way that improves the accuracy of the underlying system.

[0019] Furthermore, aspects of the present disclosure may need to be executed in a time-constrained manner, such as when an oral care appliance must be generated for a patient immediately after intraoral scanning (e.g., while the patient waits in the clinician’s office). As such, aspects of the present disclosure are necessarily rooted in the underlying computer technology of oral care metrics generation and cannot be performed by a human, even with the aid of pen and paper. For instance, implementations of the present disclosure must be capable of: 1) storing thousands or millions of mesh elements of the patient’s dentition in a manner that can be processed by a computer processor; 2) performing calculations on thousands or millions of mesh elements, e.g., to compute hundreds or thousands of distinct oral care metrics for each patient case; 3) reducing the dimensionality of those oral care metrics; and 4) computing, based on mesh processing techniques or machine learning models, one or more oral care metrics (e.g., for use in generating an oral care appliance), and do so during the course of a short office visit.

[0020] Representation learning techniques (e.g., as shown in FIG.12) may be used to train machine learning models (e.g., using cohort patient case data 1200) to generate oral care metrics 1226 (e.g., restoration design metrics or orthodontic metrics). Cohort patient case data 1200 and 1218 for each patient may contain 3D representations of the patient’s teeth (or gums or other aspects of the dentition).
The tooth data may be accompanied by transforms which may be applied to place the teeth into setups poses (e.g., mal, staging or final setups). A first ML module 1202 and 1220 (e.g., which may contain one or more U-Nets, one or more encoders, one or more reconstruction autoencoders, one or more transformer encoders, one or more pyramid encoder-decoders, or others described herein) may be trained to generate latent representations 1212 and 1222 for one or more of the patient’s teeth (or representations of other aspects of the patient’s dentition). The one or more 3D representations of the patient’s teeth (or other aspects of dentition) may contain one or more mesh elements (as described herein). The first ML module 1202 may, in some implementations, use a mesh element feature module 1204 (as described elsewhere in this specification) to compute mesh element feature vectors for one or more of the mesh elements. These mesh element features may assist the first ML module 1202 in encoding the distributions of the shapes and/or structures of the patient’s dentition, resulting in latent representations that are more accurate (e.g., latent representations which may, in some implementations, be reconstructed into more accurate facsimiles of the original teeth or gums). The representations (e.g., latent representations or latent forms) may be described using a combination of information-rich and/or reduced-dimensionality versions of the patient’s dentition (e.g., teeth). The first ML module 1202 shown in FIG.12 contains a reconstruction autoencoder which uses encoder E1 1206 to encode the patient’s dentition 1200 into a latent form 1208, which may then be reconstructed by decoder D1 1210 into reconstructed representations of the dentition 1214. The latent representation 1208 may be outputted (1212). The outputted latent representation 1212 (or 1222) may be provided to a second ML module 1224 (e.g., a generative module) which may be trained to generate one or more oral care metric values (e.g., a vector of oral care metrics values) based on aspects of one or more of the patient’s teeth.

[0021] The ML models of this disclosure (e.g., FIG.12) may be trained, at least in part, through the use of loss functions. Loss may be computed to quantify the difference between predicted outputs 1226 and corresponding ground truth outputs (e.g., which may be provided as a part of the training dataset of patient case data). Cross-entropy or other loss calculation methods described herein may be used to compute such losses (e.g., losses between vectors of oral care metrics).

[0022] In some implementations, the second ML module 1224 may be trained to generate oral care metrics for a single tooth (e.g., describing the shape and/or structure of that tooth). In other implementations, the second ML module 1224 may be trained to generate oral care metrics for two or more teeth (e.g., describing the spatial relationships between the two or more teeth). The second ML module 1224 may comprise one or more multi-layer perceptrons (e.g., containing one or more fully connected layers), one or more encoders, one or more transformer decoders, one or more transformer encoders, or the like. Such generated oral care metrics may be outputted for use by other techniques of this disclosure (e.g., the generation of oral care appliances – such as orthodontic aligners or dental restoration appliances).
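The following is a condensed sketch of the kind of reconstruction autoencoder described above for the first ML module: a tooth point cloud is encoded into a latent representation and decoded back, with a reconstruction loss for training. The PointNet-style architecture and layer sizes are illustrative assumptions; the disclosure also contemplates U-Nets, transformer encoders, and other encoder types.

```python
# Sketch: reconstruction autoencoder producing latent tooth representations.
# Architecture and layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class ToothAutoencoder(nn.Module):
    def __init__(self, latent_dim: int = 128, n_points: int = 1024):
        super().__init__()
        self.n_points = n_points
        # Per-point MLP followed by max-pooling (PointNet-style) -> latent form.
        self.encoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Decoder reconstructs all point coordinates from the latent vector.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, n_points * 3),
        )

    def forward(self, points: torch.Tensor):
        # points: (batch, n_points, 3)
        latent = self.encoder(points).max(dim=1).values   # (batch, latent_dim)
        recon = self.decoder(latent).view(-1, self.n_points, 3)
        return latent, recon

model = ToothAutoencoder()
tooth = torch.randn(2, 1024, 3)              # two hypothetical tooth point clouds
latent, recon = model(tooth)
loss = nn.functional.mse_loss(recon, tooth)  # reconstruction loss for training
print(latent.shape, loss.item())
```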
The first ML module or the second ML module may be trained, at least in part, through the calculation of one or more loss values (e.g., L1 loss, L2 loss, mean squared error (MSE) loss, cross-entropy loss, or others disclosed herein). Such a loss value may be computed by comparing one or more predicted values for one or more oral care metrics to corresponding ground truth or reference values for the one or more oral care metrics, for a particular example from the training dataset. In some implementations, oral care arguments 1216 may be provided to the second ML module 1224, to customize the generative predictions of the second ML module 1224.

[0023] For example, a training dataset of cohort patient cases may be provided to the first ML module 1220 and the second ML module 1224 for training. Each case may contain, for example, a set of segmented teeth, a set of malocclusion transforms (which place respective teeth into maloccluded poses), or a set of ground truth (or reference) setups transforms (e.g., final setup transforms). One or more oral care metric values may be provided along with each case (e.g., “Alignment”, “Overbite”, or “Curve-of-spee”, etc.). Alternatively, the oral care metrics for the patient’s dentition may be computed by an Oral Care Metrics module (e.g., using mesh processing techniques), according to the descriptions found herein. The set of oral care metrics may be provided (or computed) for the mal setup, and the set of oral care metrics may be provided (or computed) for the ground truth final setup. The mal setup may be provided to the first ML module, which may generate the latent representations (e.g., as described elsewhere in this specification) for each mal tooth, called the mal latent representations. The ground truth final setup may be provided to the first ML module, which may generate latent representations (e.g., as described elsewhere in this specification) for each ground truth final setup tooth, called the ground truth latent representations. In some implementations, the first ML module has a U-Net which has been trained to generate latent representations of the teeth which contain hierarchical neural network features. In some implementations, the second ML module contains a set of four fully connected layers, with optional skip connections.

[0024] The mal latent representations may be concatenated (or otherwise combined) and provided to the second ML module, which may generate one or more oral care metrics values for the teeth of the malocclusion. For example, the second ML module may generate a vector containing values for the “Alignment”, “Overbite”, or “Curve-of-spee” orthodontic metrics. The generated (or predicted) oral care metrics values may be compared to the corresponding metrics values which were originally computed (or provided) for the teeth of the malocclusion. A loss may be computed as a result of the comparison (e.g., an L1, L2, MSE, or cross-entropy loss, among others described herein), and subsequently be used to train (e.g., using backpropagation), at least in part, either or both of the first ML module and the second ML module.

[0025] Likewise, the ground truth final setup latent representations may be concatenated (or otherwise combined) and provided to the second ML module, which may generate one or more oral care metrics values for the teeth of the ground truth final setup.
For example, the second ML module may generate a vector containing values for the “Alignment”, “Overbite”, or “Curve-of-spee” orthodontic metrics. The generated (or predicted) oral care metrics values may be compared to the corresponding metrics values which were originally computed (or provided) for the teeth of the ground truth final setup. A loss may be computed as a result of the comparison (e.g., an L1, L2, MSE, or cross-entropy loss, among others described herein), and subsequently be used to train (e.g., using backpropagation), at least in part, either or both of the first ML module and the second ML module. In other implementations, the second ML module may be trained to generate other sets of one or more of the oral care metrics described herein.
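A minimal sketch of the second ML module under the configuration described above (fully connected layers with optional skip connections) follows: concatenated per-tooth latent representations are mapped to a vector of oral care metric values, and an MSE loss against reference metric values drives backpropagation. The layer widths, tooth count, and the three metric slots are assumptions for illustration.

```python
# Sketch: second ML module mapping concatenated latent representations to
# a vector of oral care metric values. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

class MetricsHead(nn.Module):
    def __init__(self, in_dim: int, hidden: int = 256, n_metrics: int = 3):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.fc3 = nn.Linear(hidden, hidden)
        self.fc4 = nn.Linear(hidden, n_metrics)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h1 = self.act(self.fc1(x))
        h2 = self.act(self.fc2(h1))
        h3 = self.act(self.fc3(h2)) + h1  # optional skip connection
        return self.fc4(h3)               # e.g., [alignment, overbite, curve-of-spee]

n_teeth, latent_dim = 28, 128
head = MetricsHead(in_dim=n_teeth * latent_dim)
latents = torch.randn(4, n_teeth, latent_dim)    # per-tooth latent representations
predicted = head(latents.flatten(start_dim=1))   # concatenate, then predict metrics
reference = torch.randn(4, 3)                    # metrics computed/provided per case
loss = nn.functional.mse_loss(predicted, reference)
loss.backward()  # backpropagation; with an encoder attached, both modules train
print(predicted.shape, loss.item())
```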
[0028] Aspects of the present disclosure can provide a technical solution to the technical problem of visualizing, using one or more dimensionality reduction techniques, one or more oral care metrics. In particular, by practicing techniques disclosed herein, computing systems specifically adapted to visualize and/or analyze configurations of orthodontic setups for oral care appliance generation are improved. For example, aspects of the present disclosure improve the performance of a computing system for visualizing oral care metrics data by reducing the consumption of computing resources. In particular, aspects of the present disclosure reduce computing resource consumption by reducing the dimensionality of high-dimensional vectors of oral care metrics (e.g., reducing hundreds or thousands of dimensions to 2 or 3 dimensions which can be easily plotted and visualized) so that computing resources are not unnecessarily wasted by visualizing large quantities of oral care metrics plots.

[0029] Furthermore, aspects of the present disclosure may need to be executed in a time-constrained manner, such as when an oral care appliance must be generated for a patient immediately after intraoral scanning (e.g., while the patient waits in the clinician’s office). As such, aspects of the present disclosure are necessarily rooted in the underlying computer technology of oral care metrics dimensionality reduction and visualization and cannot be performed by a human, even with the aid of pen and paper. For instance, implementations of the present disclosure must be capable of: 1) storing thousands or millions of mesh elements of the patient’s dentition in a manner that can be processed by a computer processor; 2) performing calculations on thousands or millions of mesh elements, e.g., to compute hundreds or thousands of distinct oral care metrics for each patient case; 3) reducing the dimensionality of those oral care metrics; and 4) visualizing, based on one or more dimensionality reduction techniques, one or more oral care metrics so that treatment planning can be performed (e.g., planning of 10-50 intermediate stages of orthodontic treatment), and do so during the course of a short office visit.

[0030] This disclosure pertains to digital oral care, which encompasses the fields of digital dentistry and digital orthodontics. This disclosure generally describes methods of processing three-dimensional (3D) representations of oral care data. It should be understood, without loss of generality, that there are various types of 3D representations. One type of 3D representation is a 3D geometry.
A 3D representation may include, be, or be part of one or more of a 3D polygon mesh, a 3D point cloud (e.g., such as derived from a 3D mesh), a 3D voxelized representation (e.g., a collection of voxels – for sparse processing), or 3D representations which are described by mathematical equations. Although the term “mesh” is used frequently throughout this disclosure, the term should be understood, in some implementations, to be interchangeable with other types of 3D representations. A 3D representation may describe elements of the 3D geometry and/or 3D structure of an object.

[0031] Dental arches S1, S2, S3 and S4 all contain the exact same tooth meshes, but those tooth meshes are transformed differently, according to the following description. A first arch S1 includes a set of tooth meshes arranged (e.g., using transforms) in their positions in the mouth, where the teeth are in the mal positions and orientations. A second arch S2 includes the same set of tooth meshes from S1 arranged (e.g., using transforms) in their positions in the mouth, where the teeth are in the ground truth setup positions and orientations. A third arch S3 includes the same meshes as S1 and S2, which are arranged (e.g., using transforms) in their positions in the mouth, where the teeth are in the predicted final setup poses (e.g., as predicted by one or more of the techniques of this disclosure). S4 is a counterpart to S3, where the teeth are in the poses corresponding to one of the several intermediate stages of orthodontic treatment with clear tray aligners.

[0032] It should be understood, without loss of generality, that the techniques of this disclosure which apply to final setups are also applicable to intermediate staging in orthodontic treatment, particularly geometric deep learning (GDL) Setups, reinforcement learning (RL) Setups, variational autoencoder (VAE) Setups, Capsule Setups, multilayer perceptron (MLP) Setups, Diffusion Setups, pose transfer (PT) Setups, Similarity Setups, force directed graphs (FDG) Setups, Transformer Setups, Setups Comparison, or Setups Classification. The Metrics Visualization aspects of this disclosure may also be configured to visualize data from both final setups and intermediate stages. MLP Setups, VAE Setups and Capsule Setups each fall within the scope of Autoencoder Setups. Some implementations of MLP Setups may fall within the scope of Transformer Setups. Representation Setups refers to any of MLP Setups, VAE Setups, Capsule Setups and any other setups prediction machine learning model which uses an autoencoder to create the representation for at least one tooth.

[0033] Each of the setups prediction techniques of this disclosure is applicable to the fabrication of clear tray aligners and indirect bonding trays. The setups prediction techniques may also be applicable to other products that involve final teeth poses. A pose may comprise a position (or location) and a rotation (or orientation).

[0034] A 3D mesh is a data structure which may describe the geometry or shape of an object related to oral care, including but not limited to a tooth, a hardware element, or a patient’s gum tissue. A 3D mesh may include one or more mesh elements such as one or more of vertices, edges, faces and combinations thereof. In some implementations, mesh elements may include voxels, such as in the context of sparse mesh processing operations.
Various spatial and structural features may be computed for these mesh elements and provided to the predictive models of this disclosure, with the technical advantage of improved data precision, in that the models of this disclosure output more accurate predictions.

[0035] A patient’s dentition may include one or more 3D representations of the patient’s teeth (e.g., and/or associated transforms), gums and/or other oral anatomy. An orthodontic metric (OM) may, in some implementations, quantify the relative positions and/or orientations of at least one 3D representation of a tooth relative to at least one other 3D representation of a tooth. A restoration design metric (RDM) may, in some implementations, quantify at least one aspect of the structure and/or shape of a 3D representation of a tooth. An orthodontic landmark (OL) may, in some implementations, locate one or more points or other structural regions of interest on a 3D representation of a tooth. An OL may, in some implementations, be used in the generation of an orthodontic or dental appliance, such as a clear tray aligner or a dental restoration appliance. A mesh element may, in some implementations, comprise at least one constituent element of a 3D representation of oral care data. For example, in the case of a tooth that is represented by a 3D mesh, mesh elements may include at least: vertices, edges, faces and voxels. A mesh element feature may, in some implementations, quantify some aspect of a 3D representation in proximity to or in relation with one or more mesh elements, as described elsewhere in this disclosure. Orthodontic procedure parameters (OPP) may, in some implementations, specify at least one value which defines at least one aspect of planned orthodontic treatment for the patient (e.g., specifying desired target attributes of a final setup in final setups prediction). Orthodontic doctor preferences (ODP) may, in some implementations, specify at least one typical value for an OPP, which may, in some instances, be derived from past cases which have been treated by one or more oral care practitioners. Restoration design parameters (RDP) may, in some implementations, specify at least one value which defines at least one aspect of planned dental restoration treatment for the patient (e.g., specifying desired target attributes of a tooth which is to undergo treatment with a dental restoration appliance). Doctor restoration design preferences (DRDP) may, in some implementations, specify at least one typical value for an RDP, which may, in some instances, be derived from past cases which have been treated by one or more oral care practitioners.
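As a concrete illustration of an orthodontic metric computed with mesh processing techniques, the sketch below measures an interproximal gap between two adjacent teeth as the minimum distance between their vertex sets; this specific metric definition is an assumption for illustration rather than a formula prescribed by this disclosure.

```python
# Sketch: an orthodontic metric computed by mesh processing - the
# interproximal gap between adjacent teeth, taken here (as an assumption)
# to be the minimum distance between their vertex sets.
import numpy as np

def interproximal_gap(tooth_a: np.ndarray, tooth_b: np.ndarray) -> float:
    """tooth_a: (N, 3), tooth_b: (M, 3) vertex arrays in a shared frame."""
    # Pairwise distances via broadcasting (adequate for decimated meshes).
    diffs = tooth_a[:, None, :] - tooth_b[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=2)).min())

rng = np.random.default_rng(0)
incisor = rng.normal(loc=(0.0, 0.0, 0.0), scale=0.3, size=(200, 3))
neighbor = rng.normal(loc=(8.0, 0.0, 0.0), scale=0.3, size=(200, 3))
print(f"gap: {interproximal_gap(incisor, neighbor):.2f} mm")
```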
3D oral care representations may include, but are not limited to: 1) a set of mesh element labels which may be applied to the 3D mesh elements of teeth/gums/hardware/appliance meshes (or point clouds) in the course of mesh segmentation or mesh cleanup; 2) 3D representation(s) for one or more teeth/gums/hardware/appliances for which shapes have been modified (e.g., trimmed, distorted, or filled-in) in the course of mesh segmentation or mesh cleanup; 3) one or more coordinate systems (e.g., describing one, two, three or more coordinate axes) for a single tooth or a group of teeth (such as a full arch – as with the LDE coordinate system); 4) 3D representation(s) for one or more teeth for which shapes have been modified or otherwise made suitable for use in dental restoration; 5) 3D representation(s) for one or more dental restoration appliance components; 6) one or more transforms to be applied to one or more of: dental restoration appliance library component placement relative to one or more teeth, a tooth to be placed for an orthodontic setup (either final setup or intermediate stage), a hardware element to be placed relative to one or more teeth, or the like; 7) an orthodontic setup; 8) a 3D representation of a hardware element (such as facial bracket, lingual bracket, orthodontic attachment, button, hook, bite ramp, etc.) to be placed relative to one or more teeth; 9) a 3D representation of a bonding pad for a hardware element (which may be generated for a specific tooth by outlining a perimeter on the tooth, specifying a thickness to form a shell, and then subtracting-out the tooth via a Boolean operation); 10) a 3D representation of a clear tray aligner (CTA); 11) the location or shape of a CTA trimline (e.g., described as either a mesh or polyline); 12) an archform that describes the contours or layout of an arch of teeth (e.g., described as a 3D polyline or as a 3D mesh or surface), which may follow the incisal edges of one or more teeth, which may follow the facial surfaces of one or more teeth, which may in some implementations correspond to the maloccluded arch and in other implementations correspond to the final setup arch (the effects of malocclusion on the shape of the archform may be diminished by smoothing or averaging of the shape of the archform), and which may be described by one or more control points and/or a spline; 13) a 3D representation of a fixture model (e.g., depictions of teeth and gums for use in thermoforming clear tray aligners, or depictions of teeth/gums/hardware for use in thermoforming indirect bonding trays); 14) one or more latent space vectors (or latent capsules) produced by the 3D encoder stage of a 3D autoencoder which has been trained on the reconstruction of oral care meshes (e.g., a variational autoencoder which has been trained for tooth reconstruction); 15) one or more oral care metrics values (e.g., such as orthodontic metrics or restoration design generation metrics) for one or more teeth; 16) one or more landmarks (e.g., 3D points) which describe the shapes and/or geometrical attributes of one or more teeth, other dentition structures or hardware structures (e.g., to be used in orthodontic setups creation or restoration appliance component generation or placement); 17) a 3D representation created by scanning (e.g., optically scanning, CT scanning or MRI scanning) a 3D printed part corresponding to one or more teeth/gums/hardware/appliances (e.g., a scanned fixture model); 18) 3D printed aligners (including optionally local thickness, reinforcing rib
geometry, flap positioning, or the like); 19) a 3D representation of the patient's dentition that was captured chairside by a clinician or medical practitioner (e.g., in a context where the 3D representation is validated chairside, before the patient leaves the clinic, so that errors can be detected and re-scans performed as necessary); 20) a dental restoration tooth design (e.g., for veneers, crowns, bridges or dental restoration appliances); 21) 3D representations of one or more teeth for use in digital oral care treatment; 22) other 3D printed parts pertaining to oral care procedures or other fields; 23) IPR cut surfaces; 24) one or more orthodontic setups transforms associated with one or more IPR cut surfaces; 25) a (digital) pontic tooth design which may fill at least a portion of the space between teeth to allow room in an orthodontic setup for an erupting tooth to later emerge from the gums; or 26) a component of a fixture model (e.g., comprising fixture model components such as interproximal webbing, block-out, bite locks, bite ramps, interproximal reinforcement, gingival ridges, torque points, power ridges, pontic tooth or dimples, among others).

[0036] A 3D representation of oral care data may comprise one or more teeth of the patient, one or more dental arches, one or more orthodontic setups, one or more appliance components (e.g., dental restoration appliance components – such as parting surfaces or doors and windows, etc.), one or more fixture model components (e.g., interproximal webbing, digital pontic teeth, etc.), or other 3D representations pertaining to digital oral care treatment disclosed herein.

[0037] Oral care metrics may quantify aspects of the shape, structure, poses or layout of the patient's dentition. In some implementations, the patient's dentition may comprise 3D representations of one or more segmented teeth and/or 3D representations of the gums. Some oral care metrics (e.g., orthodontic metrics) may quantify aspects of the poses or layouts of two or more teeth. Some oral care metrics (e.g., restoration design metrics) may either quantify aspects of the poses or layouts of two or more teeth, or aspects of the shape or structure of an individual tooth.

[0038] Dimensionality reduction methods contemplated within the scope of this disclosure include: Stochastic Neighbor Embedding (SNE), t-Distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection for Dimension Reduction (U-Map), Locally Linear Embedding (LLE), symmetric SNE, Isomap, and Sammon mapping, among others. In some implementations, techniques of this disclosure may convert a vector of oral care metrics values into a lower-dimensionality format (e.g., a tuple with two or three dimensions). This reduced-dimensionality tuple may be plotted for the purpose of visualization. FIG.4 is an example of such a visualization, in the case where a U-Map was used for dimensionality reduction. In other implementations, t-SNE may be used for dimensionality reduction. In some implementations, the plurality of reduced-dimensionality tuples may be provided to an unsupervised clustering model, which may generate one or more clusters based, at least in part, on the plurality of reduced-dimensionality tuples. Unsupervised clustering methods which may be used include: K-Means, Mini-Batch K-Means, DBSCAN, Mean Shift, OPTICS, Spectral Clustering, and Agglomerative Clustering, among others. A brief sketch of this pipeline appears below.
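A compact sketch of this dimensionality reduction and clustering pipeline follows, using t-SNE and K-Means from scikit-learn; the metric dimensionality, sample count, and cluster count are illustrative assumptions.

```python
# Sketch: reduce high-dimensional oral care metric vectors to 2D tuples
# with t-SNE, then cluster the tuples with K-Means (unsupervised).
# Dimensions, sample counts, and cluster counts are illustrative assumptions.
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# 120 hypothetical setups, each described by 300 oral care metric values.
metric_vectors = rng.normal(size=(120, 300))

# Reduce each 300-dimensional vector to a 2D tuple suitable for plotting.
tuples_2d = TSNE(n_components=2, perplexity=30,
                 random_state=0).fit_transform(metric_vectors)

# Cluster the reduced-dimensionality tuples; nearby tuples suggest setups
# with similar shapes, structures, poses, or layouts.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(tuples_2d)

# Per-cluster histograms of a single metric can then be generated, e.g.:
for k in range(3):
    counts, _ = np.histogram(metric_vectors[labels == k, 0], bins=10)
    print(f"cluster {k}: histogram of metric 0 -> {counts}")
```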
Tuples which appear in the same cluster may have similar oral care metrics values, and consequently have similar shapes, structures, poses or layouts. Stated another way, when the tuples for two setups appear near each other in the t-SNE dimensionality-reduced space, those setups may be considered to be similar. In some implementations, one or more histograms may be generated for the setups within the same cluster. The histograms may be outputted for inspection, to provide clinicians with an improved understanding of the dentitions of cohort patients. The histograms may be used in treatment planning (e.g., for orthodontic treatment).

[0039] The techniques of this disclosure may be advantageously combined. For example, the Setups Comparison tool may be used to compare the outputs of the GDL Setups, RL Setups, VAE Setups, and MLP Setups models against ground truth data. With each of these setups prediction models compared against ground truth data, it may be possible to determine which model gives the best performance on a certain dataset or within a given problem domain. Furthermore, the Metrics Visualization tool can enable a global view of the final setups and intermediate stages produced by one or more of the setups prediction models, with the advantage of enabling the selection of the best setups prediction model. The Metrics Visualization tool, furthermore, enables the computation of metrics which have a global scope over a set of intermediate stages. These global metrics may, in some implementations, be consumed as inputs to the neural networks for predicting setups (e.g., GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups, Diffusion Setups, PT Setups, Similarity Setups, among others). The global metrics may also be provided to FDG Setups. The local metrics from this disclosure (i.e., a local metric is a metric which may be computed for one stage or setup of treatment, rather than over several stages or setups) may, in some implementations, be consumed by the neural networks herein for predicting setups, with the advantage of improving predictive results. The metrics described in this disclosure may, in some implementations, be visualized using the Metrics Visualization tool.

[0040] The VAE and MAE models for mesh element labelling and mesh in-filling can be advantageously combined with the setups prediction neural networks, for the purpose of mesh cleanup ahead of or during the prediction process. In some implementations, the VAE for mesh element labelling may be used to flag mesh elements for further processing, such as metrics calculation, removal or modification. In some instances, such flagged mesh elements may be provided as inputs to a setups prediction neural network, to inform that neural network about important mesh features, attributes or geometries, with the advantage of improving the performance of the resulting setups prediction model. In some implementations, mesh in-filling may cause the geometry of a tooth to become more nearly complete, enabling the better functioning of a setups prediction model (i.e., improved correctness of prediction on account of better-formed geometry).
In some instances, a neural network to classify a setup (i.e., the Setups Classifier) may aid in the functioning of a setups prediction neural network, because the Setups Classifier tells that setups prediction neural network when the predicted setup is acceptable for use and can be provided to a method for aligner tray generation. A Setups Classifier may aid setups prediction models (e.g., GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups, Diffusion Setups, PT Setups, Similarity Setups and FDG Setups, among others) in the generation of final setups and also in the generation of intermediate stages. Furthermore, a Setups Classifier neural network may be combined with the Metrics Visualization tool. In other implementations, a Setups Classification neural network may be combined with the Setups Comparison tool (e.g., the Setups Comparison tool may output an indication of how a setup produced in part by the Setups Classifier compares to a setup produced by another setups prediction method). In some implementations, the VAE for mesh element labelling may identify one or more mesh elements for use in a metrics calculation. The resulting metrics outputs may be visualized by the Metrics Visualization tool.

[0041] In some examples, the Setups Classifier neural network may aid in the setups prediction technique described in U.S. Patent Application No. US20210259808A1 (which is incorporated herein by reference in its entirety), the setups prediction technique described in PCT Application with Publication No. WO2021245480A1 (which is incorporated herein by reference in its entirety), or the technique described in PCT Application No. PCT/IB2022/057373 (which is incorporated herein by reference in its entirety). The Setups Classifier would help one or more of those techniques to know when the predicted final setup is most nearly correct. In some instances, the Setups Classifier neural network may output an indication of how far away from a final setup a given setup is (i.e., a progress indicator).

[0042] In some implementations, the latent space embedding vector(s) from the reconstruction VAE can be concatenated with the inputs to the setups prediction neural network described in WO2021245480A1. The latent space vectors can also be incorporated as inputs to the other setups prediction models: GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups and Diffusion Setups, among others. The advantage is to impart the reconstruction characteristics (e.g., latent vector dimensions of a tooth mesh) to that neural network, hence improving the generated setups prediction.

[0043] In some examples, the various setups prediction neural networks of this disclosure may work together to produce the setups required for orthodontic treatment. For example, the GDL Setups model may produce a final setup, and the RL Setups model may use that final setup as input to produce a series of intermediate stage setups. Alternatively, the VAE Setups model (or the MLP Setups model) may create a final setup which may be used by an RL Setups model to produce a series of intermediate stage setups. In some implementations, a setup prediction may be produced by one setups prediction neural network, and then taken as input to another setups prediction neural network for further improvements and adjustments to be made. In some implementations, such improvements may be performed in iterative fashion.

[0044] In some implementations, a setups validation model, such as the model disclosed in US Provisional Application No.
US63/366495, may be involved in this iterative setups prediction loop. First, a setup may be generated (e.g., using a model trained for setups prediction, such as GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups, Diffusion Setups, PT Setups, Similarity Setups and FDG Setups, among others), then the setup undergoes validation. If the setup passes validation, the setup may be outputted for use. If the setup fails validation, the setup may be sent back to one or more of the setups prediction models for corrections, improvements and/or adjustments. In some instances, the setups validation model may output an indication of what is wrong with the setup, enabling the setups generation model to make an improved version upon the next iteration. The process may iterate until the setup passes validation; a minimal sketch of this loop appears below, following paragraph [0046].

[0045] Generally speaking, in some implementations, two or more of the following techniques of the present disclosure may be combined in the course of orthodontic and/or dental treatment: GDL Setups, Setups Classification, Reinforcement Learning (RL) Setups, Setups Comparison, Autoencoder Setups (VAE Setups or Capsule Setups), VAE Mesh Element Labeling, Masked Autoencoder (MAE) Mesh In-Filling, Multi-Layer Perceptron (MLP) Setups, Metrics Visualization, Imputation of Missing Oral Care Parameters Values, Tooth Classification Using Latent Vector, FDG Setups, Pose Transfer Setups, Restoration Design Metrics Calculation, Neural Network Techniques for Dental Restoration and Orthodontics (e.g., 3D Oral Care Representation Generation or Modification Using Transformers), Landmark-based (LB) Setups, Diffusion Setups, Imputation of Tooth Movement Procedures, Capsule Autoencoder Segmentation, Diffusion Segmentation, Similarity Setups, Validation of Oral Care Representations (e.g., using autoencoders), Coordinate System Prediction, Restoration Design Generation or 3D Oral Care Representation Generation or Modification Using Denoising Diffusion Models.

[0046] In some instances, tooth shape-based inputs may be provided to a neural network for setups predictions. In other instances, non-shape-based inputs can be used, such as a tooth name or designation, as it pertains to dental notation. In some implementations, a vector R of flags may be provided to the neural network, where a '1' value indicates that the tooth is present and a '0' value indicates that the tooth is absent from the patient case (though other values are possible). The vector R may comprise a 1-hot vector, where each element in the vector corresponds to a tooth type, name or designation. Identifying information about a tooth (e.g., the tooth's name) can be provided to the predictive neural networks of this disclosure, with the advantage of enabling the neural network to become trained to handle different teeth in tooth-specific ways. For example, the setups prediction model may learn to make setups transformations predictions for a specific tooth designation (e.g., upper right central incisor, or lower left cuspid, etc.). In the case of the mesh cleanup autoencoders (either for labelling mesh elements or for in-filling missing mesh data), the autoencoder may be trained to provide specialized treatment to a tooth according to that tooth's designation, in this manner. In the case of a setups classification neural network, a listing of tooth name(s) present in the patient's arch may better enable the neural network to output an accurate determination of setup classification, because tooth designation is a valuable input to training such a neural network.
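A minimal Python sketch of the iterative generate-validate loop of paragraph [0044] follows. The names predict_setup, validate_setup and refine_setup are hypothetical stand-ins for a trained setups prediction model and a trained setups validation model; they are not names defined by this disclosure.

def iterate_until_valid(mal_arch, predict_setup, validate_setup, refine_setup, max_iters=10):
    # Generate an initial setup from the maloccluded arch.
    setup = predict_setup(mal_arch)
    for _ in range(max_iters):
        # The validation model returns a pass/fail flag and an indication
        # of what is wrong with the setup.
        is_valid, feedback = validate_setup(setup)
        if is_valid:
            return setup  # passes validation; may be outputted for use
        # The feedback enables an improved version upon the next iteration.
        setup = refine_setup(setup, feedback)
    return None  # did not pass validation within the iteration budget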
Tooth designation/name may be defined, for example, according to the Universal Numbering System, Palmer System, or the FDI World Dental Federation notation (ISO 3950).

[0047] In one example, where all except the (up to four) wisdom teeth are present in the case, a vector R may be defined as an optional input to the setups prediction neural networks of this disclosure, where there is a 0 in the vector element corresponding to each of the wisdom teeth, and a 1 in the elements corresponding to the following teeth: UR7, UR6, UR5, UR4, UR3, UR2, UR1, UL1, UL2, UL3, UL4, UL5, UL6, UL7, LL7, LL6, LL5, LL4, LL3, LL2, LL1, LR1, LR2, LR3, LR4, LR5, LR6, LR7 (see the sketch below).

[0048] In some instances, the position of the tooth tip may be provided to a neural network for setups predictions. In other instances, one or more vectors S of the orthodontic metrics described elsewhere in this disclosure may be provided to a neural network for setups predictions. The advantage is an improved capacity for the network to become trained to understand the state of a maloccluded setup and therefore be able to predict a more accurate final setup or intermediate stage.

[0049] In some implementations, the neural networks may take as input one or more indications of interproximal reduction (IPR) U, which may indicate the amount of enamel that is to be removed from a tooth during the course of orthodontic treatment (either mesially or distally). In some implementations, IPR information (e.g., quantity of IPR that is to be performed on one or more teeth, as measured in millimeters, or one or more binary flags to indicate whether or not IPR is to be performed on each tooth identified by flagging) may be concatenated with a latent vector A which is produced by a VAE or a latent capsule T autoencoder. The vector(s) and/or capsule(s) resulting from such a concatenation may be provided to one or more of the neural networks of the present disclosure, with the technical improvement or added advantage of enabling that predictive neural network to account for IPR. IPR is especially relevant to setups prediction methods, which may determine the positions and poses of teeth at the end of treatment or during one or more stages during treatment. It is important to account for the amount of enamel that is to be removed ahead of predicted tooth movements.

[0050] In some implementations, one or more procedure parameters K and/or doctor preferences vectors L may be introduced to a setups prediction model. In some implementations, one or more optional vectors or values may also be introduced, such as tooth position N (e.g., XYZ coordinates, in either tooth local or global coordinates), tooth orientation O (e.g., pose, such as in transformation matrices or quaternions, Euler angles or other forms described herein), dimensions of teeth P (e.g., length, width, height, circumference, diameter, diagonal measure, volume - any of which dimensions may be normalized in comparison to another tooth or teeth), and distance between adjacent teeth Q. These "dimensions of teeth P" may in some instances be used to describe the intended dimensions of a tooth for dental restoration design generation.

[0051] In some implementations, tooth dimensions P such as length, width, height, or circumference may be measured inside a plane, such as the plane that intersects the centroid of the tooth, or the plane that intersects a center point that is located midway between the centroid and either the incisal-most extent or the gingival-most extent of the tooth.
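As a concrete illustration of the presence vector R of paragraphs [0046] and [0047], the following Python sketch builds a 1-hot presence vector over Palmer-style tooth designations. The ordering and helper names are illustrative assumptions.

TOOTH_ORDER = ["UR8", "UR7", "UR6", "UR5", "UR4", "UR3", "UR2", "UR1",
               "UL1", "UL2", "UL3", "UL4", "UL5", "UL6", "UL7", "UL8",
               "LL8", "LL7", "LL6", "LL5", "LL4", "LL3", "LL2", "LL1",
               "LR1", "LR2", "LR3", "LR4", "LR5", "LR6", "LR7", "LR8"]

def presence_vector(teeth_present):
    # 1 where the tooth is present in the patient case, 0 where absent.
    return [1 if name in teeth_present else 0 for name in TOOTH_ORDER]

# Example from paragraph [0047]: all teeth present except the (up to four)
# wisdom teeth (UR8, UL8, LL8, LR8).
r = presence_vector({t for t in TOOTH_ORDER if not t.endswith("8")})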
The tooth dimension of height may be measured as the distance from gums to incisal edge. The tooth dimension of width may be measured as the distance from the mesial extent to the distal extent of the tooth. In some implementations, the circularity or roundness of the tooth cross-section may be measured and included in the vector P. Circularity or roundness may be defined as the ratio of the radii of inscribed and circumscribed circles.

[0052] The distance Q between adjacent teeth can be implemented in different ways (and computed using different distance definitions, such as Euclidean or geodesic). In some implementations, a distance Q1 may be measured as an averaged distance between the mesh elements of two adjacent teeth. In some implementations, a distance Q2 may be measured as the distance between the centers or centroids of two adjacent teeth. In some implementations, a distance Q3 may be measured between the mesh elements of closest approach between two adjacent teeth. In some implementations, a distance Q4 may be measured between the cusp tips of two adjacent teeth. Teeth may, in some implementations, be considered adjacent within an arch. Teeth may, in some implementations, also be considered adjacent between opposing arches. In some implementations, any of Q1, Q2, Q3 and Q4 may be divided by a term for the purpose of normalizing the resulting value of Q. In some implementations, the normalizing term may involve one or more of: the volume of a tooth, the count of mesh elements in a tooth, the surface area of a tooth, the cross-sectional area of a tooth (e.g., as projected into the XY plane), or some other term related to tooth size. A sketch of two of these distance computations appears after paragraph [0054] below.

[0053] Other information about the patient's dentition or treatment needs (or related parameters) may be concatenated with the other input vectors to one or more of MLP, GAN, generator, encoder structure, decoder structure, transformer, VAE, conditional VAE, regularized VAE, 3D U-Net, capsule autoencoder, diffusion model, and/or any of the neural network models listed elsewhere in this disclosure.

[0054] The vector M may contain flags which apply to one or more teeth. In some implementations, M contains at least one flag for each tooth to indicate whether the tooth is pinned. In some implementations, M contains at least one flag for each tooth to indicate whether the tooth is fixed. In some implementations, M contains at least one flag for each tooth to indicate whether the tooth is pontic. Other and additional flags are possible for teeth, as are combinations of fixed, pinned and pontic flags. A flag that is set to a value that indicates that a tooth should be fixed is a signal to the network that the tooth should not move over the course of treatment. In some implementations, the neural network loss function may be designed to be penalized for any movement in the indicated teeth (and in some particular cases, may be heavily penalized). A flag to indicate that a tooth is pontic informs the network that the tooth gap is to be maintained, although that gap is allowed to move. In some cases, M may contain a flag indicating that a tooth is missing. In some implementations, the presence of one or more fixed teeth in an arch may aid in setups prediction, because the one or more fixed teeth may provide an anchor for the poses of the other teeth in the arch (i.e., may provide a fixed reference for the pose transformations of one or more of the other teeth in the arch).
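The following Python sketch illustrates two of the adjacency distances of paragraph [0052], treating each tooth as a point cloud of mesh vertex coordinates (an assumption made for brevity), along with one possible normalization.

import numpy as np

def q2_centroid_distance(tooth_a, tooth_b):
    # Q2: distance between the centroids of two adjacent teeth.
    return np.linalg.norm(tooth_a.mean(axis=0) - tooth_b.mean(axis=0))

def q3_closest_approach(tooth_a, tooth_b):
    # Q3: distance between the mesh elements of closest approach.
    diffs = tooth_a[:, None, :] - tooth_b[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1)).min()

def normalized(q, tooth_a):
    # Normalize by a term related to tooth size, here the mesh element count.
    return q / len(tooth_a)

tooth_a = np.random.rand(100, 3)                    # hypothetical vertex data
tooth_b = np.random.rand(100, 3) + [1.0, 0.0, 0.0]  # a neighboring tooth
q = normalized(q3_closest_approach(tooth_a, tooth_b), tooth_a)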
In some implementations, one or more teeth may be intentionally fixed, so as to provide an anchor against which the other teeth may be positioned. In some implementations, a 3D representation (such as a mesh) which corresponds to the gums may be introduced, to provide a reference point against which teeth can be moved.

[0055] Without loss of generality, one or more of the optional input vectors K, L, M, N, O, P, Q, R, S, U and V described elsewhere in this disclosure may also be provided to the input layer or into an intermediate layer of one or more of the predictive models of this disclosure. In particular, these optional vectors may be provided to the MLP Setups, GDL Setups, RL Setups, VAE Setups, Capsule Setups and/or Diffusion Setups, with the advantage of enabling the respective model to generate setups which better meet the orthodontic treatment needs of the patient. In some implementations, such inputs may be provided, for example, by being concatenated with one or more latent vectors A which are also provided to one or more of the predictive models of this disclosure. In some implementations, such inputs may be introduced, for example, by being concatenated with one or more latent capsules T which are also provided to one or more of the predictive models of this disclosure.

[0056] In some implementations, one or more of K, L, M, N, O, P, Q, R, S, U and V may be introduced to the neural network (e.g., MLP or Transformer) directly in a hidden layer of the network. In some instances, one or more of K, L, M, N, O, P, Q, R, S, U and V may be introduced directly into the internal processing of an encoder structure.

[0057] In some implementations, a setups prediction model (such as GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups, PT Setups, Similarity Setups and Diffusion Setups) may take as input one or more latent vectors A which correspond to one or more input oral care meshes (e.g., such as tooth meshes). In some implementations, a setups prediction model (such as GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups and Diffusion Setups) may take as input one or more latent capsules T which correspond to one or more input oral care meshes (e.g., such as tooth meshes). In some implementations, a setups prediction method may take as input both A and T.

[0058] Various loss calculation techniques are generally applicable to the techniques of this disclosure (e.g., GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups, Diffusion Setups, PT Setups, Similarity Setups, Setups Classification, Tooth Classification, VAE Mesh Element Labelling, MAE Mesh In-Filling and the imputation of procedure parameters).

[0059] These losses include L1 loss, L2 loss, mean squared error (MSE) loss, and cross entropy loss, among others. Losses may be computed and used in the training of neural networks, such as multi-layer perceptrons (MLPs), U-Net structures, generators and discriminators (e.g., for GANs), autoencoders, variational autoencoders, regularized autoencoders, masked autoencoders, transformer structures, or the like. Some implementations may use either triplet loss or contrastive loss, for example, in the learning of sequences.

[0060] Losses may also be used to train encoder structures and decoder structures.
A KL-divergence loss may be used, at least in part, to train one or more of the neural networks of the present disclosure, such as a mesh reconstruction autoencoder or the generator of GDL Setups, with the advantage of imparting Gaussian behavior to the optimization space. This Gaussian behavior may enable a reconstruction autoencoder to produce a better reconstruction (e.g., when a latent vector representation is modified and that modified latent vector is reconstructed using a decoder, the resulting reconstruction is more likely to be a valid instance of the inputted representation). Other techniques for computing losses are described elsewhere in this disclosure. Such losses may be based on quantifying the difference between two or more 3D representations.

[0061] MSE loss calculation may involve the calculation of an average squared distance between two sets, vectors or datasets. MSE may be generally minimized. MSE may be applicable to a regression problem, where the prediction generated by the neural network or other machine learning model may be a real number. In some implementations, a neural network may be equipped with one or more linear activation units on the output to generate an MSE prediction. Mean absolute error (MAE) loss and mean absolute percentage error (MAPE) loss can also be used in accordance with the techniques of this disclosure.

[0062] Cross entropy may, in some implementations, be used to quantify the difference between two or more distributions. Cross entropy loss may, in some implementations, be used to train the neural networks of the present disclosure. Cross entropy loss may, in some implementations, involve comparing a predicted probability to a ground truth probability. Other names for cross entropy loss include "logarithmic loss," "logistic loss," and "log loss". A small cross entropy loss may indicate a better (e.g., more accurate) model. Cross entropy loss may be logarithmic. Cross entropy loss may, in some implementations, be applied to binary classification problems. In some implementations, a neural network may be equipped with a sigmoid activation unit at the output to generate a probability prediction. In the case of multi-class classification, cross entropy may also be used. In such a case, a neural network trained to make multi-class predictions may, in some implementations, be equipped with one or more softmax activation functions at the output (e.g., where there is one output node for each class that is to be predicted). Other loss calculation techniques which may be applied in the training of the neural networks of this disclosure include one or more of: Huber loss, hinge loss, categorical hinge loss, cosine similarity, Poisson loss, log-cosh loss, or mean squared logarithmic error loss (MSLE). Other loss calculation methods are described herein and may be applied to the training of any of the neural networks described in the present disclosure.

[0063] One or more of the neural networks of the present disclosure may, in some implementations, be trained, at least in part, by a loss which is based on at least one of: a Point-wise Mesh Euclidean Distance (PMD) and an Earth Mover's Distance (EMD). Some implementations may incorporate a Hausdorff Distance (HD) calculation into the loss calculation.
Computing the Hausdorff distance between two or more 3D representations (such as 3D meshes) may provide one or more technical improvements, in that the HD not only accounts for the distances between two meshes, but also accounts for the way that those meshes are oriented, and the relationship between the mesh shapes in those orientations (or positions or poses). Hausdorff distance may improve the comparison of two or more tooth meshes, such as two or more instances of a tooth mesh which are in different poses (e.g., such as the comparison of a predicted setup to a ground truth setup which may be performed in the course of computing a loss value for training a setups prediction neural network).

[0064] Systems of this disclosure may compute reconstruction error as a combination of L1 loss and MSE loss, as shown in the following line of pseudocode: reconstruction_error = 0.5*L1(all_points_target,all_points_predicted) + 0.5*MSE(all_points_target,all_points_predicted). In the above example, all_points_target may comprise a point cloud corresponding to a ground truth tooth restoration design (or a ground truth example of some other 3D oral care representation). In the above example, all_points_predicted may comprise a point cloud corresponding to a generated example of a tooth restoration design (or a generated example of some other kind of 3D oral care representation).

[0065] The following paper is incorporated herein by reference in its entirety: "Attention Is All You Need"; Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin; NIPS 2017. The neural network-based models of this disclosure may provide additional advantages in implementations in which they are integrated with a neural network structure referred to as a "transformer."

[0066] Before recently developed models such as the transformer, RNN-type models represented the state of the art for natural language processing (NLP). One example application of NLP is the generation of new text based upon prior words or text. Transformers have in turn provided significant improvements over GRU, LSTM and other such RNN-based NLP techniques due to an important attribute of the transformer model: multi-headed attention. In some implementations, the NLP concept of multi-headed attention may describe the relationship between each word in a sentence (or paragraph or document or corpus of documents) and each other word in that sentence (or paragraph or document or corpus of documents). These relationships may be generated by a multi-headed attention module, and may be encoded in vector form. This vector may describe how each word in a sentence (or paragraph or document or corpus of documents) should attend to each other word in that sentence (or paragraph or document or corpus of documents). RNN, LSTM and GRU models process a sequence, such as a sentence, one word at a time from the start to the end of the sequence. Furthermore, the model may only account for a given subset (called a window) of the sentence when making a prediction. However, transformer-based models may, in some instances, account for the entirety of the preceding text by processing the sequence in its entirety in a single step. Transformer, RNN, LSTM, and GRU models can all be adapted for use in predictive models in digital dentistry and digital orthodontics, particularly for the setup prediction task.
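The pseudocode of paragraph [0064] may be rendered, for example, as the following runnable PyTorch sketch; the equal 0.5 weights follow the pseudocode above, and the point cloud shapes are illustrative assumptions.

import torch
import torch.nn.functional as F

def reconstruction_error(all_points_target, all_points_predicted):
    # Equal-weighted combination of L1 loss and MSE loss over point clouds.
    return (0.5 * F.l1_loss(all_points_predicted, all_points_target)
            + 0.5 * F.mse_loss(all_points_predicted, all_points_target))

all_points_target = torch.rand(1024, 3)     # ground truth point cloud
all_points_predicted = torch.rand(1024, 3)  # generated point cloud
loss = reconstruction_error(all_points_target, all_points_predicted)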
In some implementations, an exemplary transformer model for use with 3D meshes and 3D transforms in setups prediction (or other oral care techniques) may be adapted from the Bidirectional Encoder Representation from Transformers (BERT) and/or Generative Pre-Training (GPT) models. For example, a GPT (or BERT) model may first be trained on other data, such as text or document data, and then be used in transfer learning. Such a transfer learning process may receive a previously trained GPT or BERT model, and then do further training using data comprising 3D oral care representations. Such transfer learning may be performed to train oral care models such as: segmentation, mesh cleanup, coordinate system prediction, setups prediction, validation of 3D oral care representations, transform prediction for placement of oral care meshes (e.g., teeth, hardware, appliance components, fixture model components), tooth restoration design generation (or generation of other 3D oral care representations – such as appliance components, fixture models or archforms), classification of 3D oral care representations, imputation of missing oral care parameters, clustering of clinicians or clustering of clinician preferences, or the like.

[0067] Oral care data may comprise one or more of (or combinations of): 3D representations of teeth (e.g., meshes, point clouds or voxels), sections of tooth meshes (such as subsets of mesh elements), tooth transforms (such as in matrix, vector and/or quaternion form, or combinations thereof), transforms for appliance components, transforms for fixture model components, and mesh coordinate system definitions (such as represented by transforms, for example, transformation matrices) and/or other 3D oral care representations described herein.

[0068] Transformers may be trained for generating transforms to position teeth into setups poses (or to place appliance components for use in appliance generation, or to place fixture model components for use in fixture model generation). Some implementations may operate in an offline prediction context, and some implementations operate in an online reinforcement learning (RL) context. In some implementations, a transformer may be initially trained in an offline context and then undergo further fine-tuning training in the online context. In the offline prediction context, the transformer may be trained from a dataset of cohort patient case data. In the online RL context, the transformer may be trained from either a physics model or a CAD model, for example. The transformer may learn from static data, such as transformations (e.g., a trajectory transformer). In some implementations, the transformer may provide a mapping from malocclusion to setup (e.g., receiving transformation matrices as input and generating transformation matrices as output). Some implementations of transformers may be trained to process 3D representations, such as 3D meshes, 3D point clouds or voxels (e.g., using a decision transformer), taking geometry (e.g., mesh, point cloud, voxels, etc.) as input and outputting transformations. The decision transformer may be coupled with a representation generation module that encodes a representation of the patient's dentition (e.g., teeth), such as a VAE, a U-Net, an encoder, a transformer encoder, a pyramid encoder-decoder or a simple dense or fully connected network, or a combination thereof.
In some implementations, the representation generation module (e.g., the VAE, the U-Net, the encoder, the pyramid encoder-decoder or the dense network for generating the tooth representation) may be trained to generate the representation of one or more teeth. The representation generation module may be trained on all teeth in both arches, only the teeth within the same arch (either upper or lower), only anterior teeth, only posterior teeth, or some other subset of teeth. In some implementations, such a model may be trained on each individual tooth (e.g., an upper right cuspid), so that the model is trained or otherwise configured to generate highly accurate representations for an individual tooth. In some implementations, an encoder structure may encode such a representation. In some implementations, a decision transformer may learn in an online context, in an offline context or both. An online decision transformer may be trained (e.g., using RL techniques) to output action, state, and/or reward. In some implementations, transformations may be discretized, to allow for piecewise or stepwise actions.

[0069] In some implementations, a transformer may be trained to process an embedding of the arch (i.e., to predict transforms for multiple teeth concurrently), to predict a setup. In some implementations, embeddings of individual teeth may be concatenated into a sequence, and then input into the transformer. A VAE may be trained to perform this embedding operation, a U-Net may be trained to perform such an embedding, or a simple dense or fully connected network may be trained, or a combination thereof. In some implementations, the transformer-based techniques of this disclosure may predict an action for an individual tooth, or may predict actions for multiple teeth (e.g., predict transformations for each of multiple teeth).

[0070] A 3D mesh transformer may include a transformer encoder structure (which may encode oral care data), and may be followed by a transformer decoder structure. The 3D mesh transformer encoder may encode oral care data into a latent representation, which may be combined with attention information (e.g., to concatenate a vector of attention information to the latent representation). In some implementations, the attention information may help the decoder focus on the relevant oral care data during the decoding process (e.g., to focus on tooth order or mesh element connectivity), so that the transformer decoder can generate a useful output for the 3D mesh transformer (e.g., an output which may be used in the generation of an oral care appliance). Either or both of the transformer encoder or transformer decoder may generate a latent representation. The output of the transformer decoder (or transformer encoder) may be reconstructed using a decoder into, for example, one or more tooth transforms for a setup, one or more mesh element labels for segmentation, coordinate system transforms for use in coordinate system generation, or one or more points of a point cloud or voxels or other mesh elements for another 3D representation. A transformer may include modules such as one or more of: multi-headed attention modules, feed forward modules, normalization modules, linear modules, softmax modules, and convolution modules for latent vector compression and/or representation.
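A minimal PyTorch sketch of the arrangement of paragraphs [0069] and [0070] follows: per-tooth latent embeddings are concatenated into a sequence and processed by a transformer encoder, which predicts one 4x4 transform per tooth. All dimensions and the class name are illustrative assumptions.

import torch
import torch.nn as nn

class ArchTransformer(nn.Module):
    def __init__(self, latent_dim=128, n_heads=8, n_layers=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(latent_dim, 16)  # 16 real values per 4x4 transform

    def forward(self, tooth_embeddings):
        # tooth_embeddings: (batch, n_teeth, latent_dim), e.g., from a VAE.
        attended = self.encoder(tooth_embeddings)  # multi-headed attention over the arch
        n_teeth = tooth_embeddings.shape[1]
        return self.head(attended).view(-1, n_teeth, 4, 4)

model = ArchTransformer()
transforms = model(torch.rand(2, 28, 128))  # transforms for 28 teeth, concurrently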
[0071] The encoder may be stacked one or more times, thereby further encoding the oral care data, and enabling different representations of the oral care data to be learned (e.g., different latent representations). These representations may be embedded with attention information (which may direct the decoder's focus to the relevant portions of the latent representation of the oral care data) and may be provided to the decoder in continuous form (e.g., as a concatenation of latent representations, such as latent vectors). In some implementations, the encoded output of the encoder (e.g., latent representations) may be used by downstream processing steps in the generation of oral care appliances. For example, the generated latent representation may be reconstructed into transforms (e.g., for the placement of teeth in setups, or the placement of appliance components or fixture model components), or may be reconstructed into 3D representations (e.g., 3D point clouds, 3D meshes or others disclosed herein). Stated another way, the latent representation which is generated by the transformer (e.g., containing continuously encoded attention information) may be provided to a decoder which has been configured to reconstruct the latent representation into the specific data structure which is required by a particular domain area. Continuously encoded attention information may include attention information which has undergone processing by multiple multi-headed attention modules within the transformer encoder or transformer decoder, to name one example. Furthermore, a loss may be computed for a particular domain using data from that domain. The loss calculation may train the transformer decoder to accurately reconstruct the latent representation into the output data structure pertaining to a particular domain.

[0072] For example, when the decoder generates a transform for an orthodontic setup, the decoder may be configured with outputs that describe, for example, the 16 real values which comprise a 4x4 transformation matrix (other data structures for describing transforms are possible). Stated a different way, the latent output generated by the transformer encoder (or transformer decoder) may be used to predict setups tooth transforms for one or more teeth, to place those teeth in setup positions (e.g., either final setups or intermediate stages). Such a transformer encoder (or transformer decoder) may be trained, at least in part, using a reconstruction loss (or a representation loss, among others described herein) function, which may compare predicted transforms to ground truth (or reference) transforms.

[0073] In a further example, when the decoder generates a transform for a tooth coordinate system, the decoder may be configured with outputs that describe, for example, the 16 real values which comprise a 4x4 transformation matrix (other data structures for describing transforms are possible). Stated a different way, the latent output generated by the transformer encoder (or transformer decoder) may be used to predict local coordinate systems for one or more teeth. Such a transformer encoder (or transformer decoder) may be trained, at least in part, using a representation loss (or a reconstruction loss, among others described herein) function, which may compare predicted coordinate systems to ground truth (or reference) coordinate systems.
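For the transform-prediction cases of paragraphs [0072] and [0073], a loss comparing predicted transforms to ground truth (or reference) transforms might be sketched as follows; MSE is used here only as one of the loss options named above, and the tensor shapes are assumptions.

import torch
import torch.nn.functional as F

def transform_loss(predicted, ground_truth):
    # predicted, ground_truth: (n_teeth, 4, 4) transformation matrices,
    # i.e., 16 real values per tooth reshaped into 4x4 form.
    return F.mse_loss(predicted, ground_truth)

predicted = torch.rand(28, 4, 4)              # decoder outputs
ground_truth = torch.eye(4).repeat(28, 1, 1)  # reference transforms
loss = transform_loss(predicted, ground_truth)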
[0074] In a further example, when the decoder generates a 3D point cloud (or other 3D representation, such as a 3D mesh, voxelized representation, or the like), the decoder may be configured with outputs that describe, for example, one or more 3D points (e.g., comprising XYZ coordinates). Stated a different way, the latent output generated by the transformer encoder (or transformer decoder) may be used to predict mesh elements for a generated (or modified) 3D representation. Such a transformer encoder (or transformer decoder) may be trained, at least in part, using a reconstruction loss (or an L1, L2 or MSE loss, among others described herein) function, which may compare predicted 3D representations to ground truth (or reference) 3D representations.

[0075] In a further example, when the decoder generates mesh element labels for 3D representation segmentation or 3D representation cleanup, the decoder may be configured with outputs that describe, for example, labels for one or more mesh elements. Stated a different way, the latent output generated by the transformer encoder (or transformer decoder) may be used to predict mesh element labels for mesh segmentation or mesh cleanup. Such a transformer encoder (or transformer decoder) may be trained, at least in part, using a cross entropy loss (or others described herein) function, which may compare predicted mesh element labels to ground truth (or reference) mesh element labels.

[0076] Multi-headed attention and transformers may be advantageously applied to the setups-generation problem. Multi-headed attention is a module in a 3D transformer encoder network which computes the attention weights for the provided oral care data and produces an output vector with encoded information on how each example of oral care data should attend to each other example of oral care data in an arch. An attention weight is a quantification of the relationship between pairs of oral care data.

[0077] A 3D representation of oral care data (e.g., comprising voxels, a point cloud, or a 3D mesh composed of vertices, faces or edges) may be provided to the transformer. The 3D representation may describe the patient's dentition, a fixture model (or components of a fixture model), an appliance (or components of an appliance), or the like. In some implementations, a transformer decoder (or a transformer encoder) may be equipped with multi-headed attention. Multi-headed attention may enable the transformer decoder (or transformer encoder) to attend to different portions of the 3D representation of oral care data. For example, multi-headed attention may enable the transformer to attend to mesh elements within local neighborhoods (or cliques), or to attend to global dependencies between mesh elements (or cliques). For example, multi-headed attention may enable a transformer for setups prediction (e.g., a setups prediction model which is based on a transformer) to generate a transform for a tooth, and to substantially concurrently attend to each of the other teeth in the arch while that transform is generated. Stated another way, the transform for each tooth may be generated in light of the poses of one or more other teeth in the arch, leading to a more accurate transform (e.g., a transform which conforms more closely to the ground truth or reference transform). In the example of 3D representation generation (e.g., the generation of a 3D point cloud), a transformer model may be trained to generate a tooth restoration design.
Multi-headed attention may enable the transformer to attend to multiple portions of the tooth (or to the surfaces of the adjacent teeth) while the tooth undergoes the generative process. For example, the transformer for restoration design generation may generate the mesh elements for the incisal edge of an incisor while, at least substantially concurrently, attending to the mesh elements of the mesial, distal, facial or lingual surfaces of the incisor. The result may be the generation of mesh elements to form an incisal edge for the tooth which merges seamlessly with the adjacent surfaces of the tooth. This use of multi-headed attention results in more accurate modeling of the distribution of the training dataset, over techniques which do not apply multi-headed attention.

[0078] In some implementations of the present disclosure, one or more attention vectors may be generated which describe how aspects of the oral care data interact with other aspects of the oral care data associated with the arch. In some implementations, the one or more attention vectors may be generated to describe how one or more portions of a tooth T1 interact with one or more portions of a tooth T2, a tooth T3, a tooth T4, and so on. A portion of a mesh may be described as a set of mesh elements, as defined herein. In some implementations, the interacting portions of tooth T1 and tooth T2 may be determined, in part, through the calculation of mesh correspondences, as described herein. Any of these models (RNN, GRU, LSTM and Transformer) may be advantageously applied to the task of setups transform prediction, such as in the models described herein. A transformer may be particularly advantageous in that a transformer may enable the transforms for multiple teeth, or even an entire arch, to be generated at once, rather than individually, as may be the case with some other models, such as an encoder structure. In other implementations, attention-free transformers may be used to make predictions based on oral care data.

[0079] One implementation of the GDL Setups neural network model may include a representation generation module (e.g., containing a U-Net structure, an autoencoder encoder, a transformer encoder, another type of encoder-decoder structure, or an encoder, etc.) which may provide its output to a module which is trained to generate tooth transforms (e.g., a set of fully connected layers with optional skip connections, or an encoder structure) to generate the prediction of a transform for each individual tooth. Skip connections may, in some implementations, connect the outputs of a particular layer in a neural network to the inputs of another layer in the neural network (e.g., a layer which is not immediately adjacent to the originating layer). The transform-generation module (e.g., an encoder) may handle the transform prediction one tooth at a time. Other implementations may replace this encoder structure with a transformer (e.g., a transformer encoder or transformer decoder), which may handle the predictions for all teeth substantially concurrently. Stated another way, a transformer may be configured to receive a larger number of input values than some other neural network models (e.g., a typical MLP). Because an increased number of inputs may be accommodated by the transformer, the predictions corresponding to those inputs may be generated substantially concurrently.
The representation generation module (e.g., U-Net structure) may provide its output to the transformer, and the transformer may generate the setups transforms for all of the several teeth at once, with the technical advantage of improved accuracy (because the transform for each tooth is generated in light of the transform for each of the adjacent or nearby teeth, leading to fewer collisions and better conformance with the goals of treatment). A transformer may be trained to output a transformation, such as a transform encoded by a 4x4 matrix (or some other size), a quaternion, a translation vector, Euler angles or some other form. The transformation may place a tooth into a setups pose, may place a fixture model component into a pose suitable for fixture model generation, or may place an appliance component into a pose suitable for appliance generation (e.g., dental restoration appliance, clear tray aligner, etc.). In some implementations, the transform may define a coordinate system for aspects of the patient's dentition, such as a tooth mesh (e.g., a local coordinate system for a tooth). In some implementations, the inputs to the transformer may first be encoded using a neural network (e.g., a latent representation or embedding may be generated), such as one or more linear layers, and/or one or more convolutional layers. In some implementations, the transformer may first be trained on an offline dataset, and subsequently be trained using a secondary actor-critic network, which may enable online reinforcement learning.

[0080] Transformers may, in some implementations, enable large model capacity and/or enable an attention mechanism (e.g., the capability to pay attention and respond to certain inputs). The attention mechanisms (e.g., multi-headed attention) that are found within transformers may enable intra-sequence relationships to be encoded into neural network features. Intra-sequence relationships may be encoded, for example, by associating an order number (e.g., 1, 2, 3, etc.) with each tooth in an arch, or by associating an order number with each mesh element in a 3D representation (e.g., of a tooth). In implementations where latent vectors of teeth are provided to the transformer, intra-sequence relationships may be encoded, for example, by associating an order number (e.g., 1, 2, 3, etc.) with each element in the latent vector.

[0081] Transformers may be scaled by increasing the number of attention heads and/or by increasing the number of transformer layers. Stated differently, one or more aspects of a transformer may be independently trained to handle discrete tasks, and later combined to allow the resulting transformer to perform all of the tasks for which the individual components had been trained, without degrading the predictive accuracy of the neural network. Scaling a convolutional network may be more difficult, because the models may be less malleable or less interchangeable.

[0082] Convolution has an ability to be rotation and translation invariant, which leads to improved generalization, because a convolution model may not need to account for the manner in which the input data is rotated or translated. Transformers have an ability to be permutation invariant, because intra-sequence relationships may be encoded into neural network features.
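A minimal PyTorch sketch of the order-number encoding of paragraph [0080] follows: a learned position embedding associates an order number with each tooth before the sequence enters the transformer. Dimensions are illustrative assumptions.

import torch
import torch.nn as nn

n_teeth, latent_dim = 28, 128
position_embedding = nn.Embedding(n_teeth, latent_dim)

tooth_latents = torch.rand(n_teeth, latent_dim)       # e.g., from a VAE encoder
order = torch.arange(n_teeth)                         # order numbers 0, 1, 2, ...
sequence = tooth_latents + position_embedding(order)  # input to the transformer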
[0083] In some implementations for the generation or modification of 3D oral care representations, transformers may be combined with convolution-based neural networks, such as by vertically stacking convolution layers and attention layers. Stacking transformer blocks with convolutional blocks enables the resulting structure to have the translation invariance of convolution, and also the permutation invariance of a transformer. Such stacking may improve model capacity and/or model generalization. CoAtNet is an example of a network architecture which combines convolutional and attention-based elements and may be applied to the processing of oral care data. In some instances, a network for the modification or generation of 3D oral care representations may be trained, at least in part, from CoAtNet (or another model that combines convolution and self-attention/transformers) using transfer learning.

[0084] PCT Application with Publication No. WO2020026117A1 is incorporated herein by reference in its entirety. WO2020026117A1 lists some examples of Orthodontic Metrics (OM). Further examples are disclosed herein. The orthodontic metrics may be used to quantify the physical arrangement of an arch of teeth for the purpose of orthodontic treatment (as opposed to restoration design metrics – which pertain to dentistry and describe the shape and/or form of one or more pre-restoration teeth, for the purpose of supporting dental restoration). These orthodontic metrics can measure how badly maloccluded the arch is, or conversely the metrics can measure how correctly arranged the teeth are. In some implementations, the GDL Setups model (or RL Setups, VAE Setups, Capsule Setups, MLP Setups, Diffusion Setups, PT Setups, Similarity Setups and FDG Setups) may incorporate one or more of these orthodontic metrics, or other similar or related orthodontic metrics. In some implementations, such orthodontic metrics may be incorporated into the feature vector for a mesh element, where these per-element feature vectors are provided to the setups prediction network as inputs. In some implementations, such orthodontic metrics may be directly consumed by a generator, an MLP, a transformer, or another neural network as direct inputs (such as presented in one or more input vectors of real numbers S, as described elsewhere in this disclosure). The use of such orthodontic metrics in the training of the generator may improve the performance (i.e., correctness) of the resulting generator, resulting in predicted transforms which place teeth more nearly in the correct final setups poses than would otherwise be possible. Such orthodontic metrics may be consumed by an encoder structure or by a U-Net structure (in the case of GDL Setups). Such orthodontic metrics may be consumed by an autoencoder, variational autoencoder, masked autoencoder or regularized autoencoder (in the case of VAE Setups, VAE Mesh Element Labelling, or MAE Mesh In-Filling). Such orthodontic metrics may be consumed by a neural network which generates action predictions as a part of a reinforcement learning RL Setups model. Such orthodontic metrics may be consumed by a classifier which applies a label to a setup arch (e.g., labels such as mal, staging or final setup). This description is non-limiting, as the orthodontic metrics may also be incorporated in other ways into the various techniques of this disclosure.
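One way to incorporate orthodontic metrics into per-element feature vectors, as described in paragraph [0084], is sketched below in Python; the feature layout and metric values are illustrative assumptions.

import numpy as np

element_features = np.random.rand(5000, 6)  # e.g., vertex position + normal per mesh element
s = np.array([2.1, 1.4, 0.3])               # hypothetical vector S of orthodontic metric values

# Broadcast the arch-level metrics onto every mesh element's feature vector.
augmented = np.hstack([element_features,
                       np.tile(s, (element_features.shape[0], 1))])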
[0085] The various loss calculations of the present disclosure may, in some examples, incorporate one or more orthodontic metrics, with the advantage of improving the correctness of the resulting neural network. An orthodontic metric may be used to directly compare a predicted example to the corresponding ground truth example (such as is done with the metrics in the Setups Comparison description). In other examples, one or more orthodontic metrics may be taken from this section and incorporated into a loss computation. Such an orthodontic metric may be computed on the predicted example, and the orthodontic metric would also be computed on the ground truth example. These two orthodontic metric results would then be consumed by the loss computation, with the advantage of improving the performance of the resulting neural network (see the sketch below). In some implementations, one or more orthodontic metrics pertaining to the alignment of two or more adjacent teeth may be computed and incorporated into a loss function, for example, to train, at least in part, a setups prediction neural network. In some implementations, such an orthodontic metric may facilitate the network in aligning the mesial surface of one tooth with the distal surface of an adjacent tooth. Backpropagation is an exemplary algorithm by which a neural network may be trained using one or more loss values.

[0086] In some implementations, one or more orthodontic metrics may be used to evaluate the predicted output of a neural network, such as a setups prediction. Such metrics may enable the training algorithm to determine how close the predicted output is to an acceptable output, for example, in a quantified sense. In some implementations, this use of an orthodontic metric may enable a loss value to be computed which does not depend entirely on a comparison to a ground truth. In some implementations, such a use of an orthodontic metric may enable loss calculation and network training to proceed without the need for a comparison against a ground truth example. The advantage of such an approach is that loss may be computed based on a general principle or specification for the predicted output (such as a setup) rather than tying loss calculation to a specific ground truth example (which may have been defined by a particular doctor, clinician, or technician, whose treatment philosophy may differ from that of other technicians or doctors). In some implementations, such an orthodontic metric may be defined based on an FID (Fréchet Inception Distance) score.

[0087] The following is a description of some of the orthodontic metrics which are used to quantify the state of a set of teeth in an arch for the purpose of orthodontic treatment. These orthodontic metrics indicate the degree of malocclusion that the teeth are in at a given stage of clear tray aligner treatment.

[0088] An orthodontic metric that can be computed using tensors may be especially advantageous when training one of the neural networks of the present disclosure, since tensor operations may promote efficient computations. The more efficient (and faster) the computation, the faster the rate at which training can proceed.

[0089] In some examples, an error pattern may be identified in one or more predicted outputs of an ML model (e.g., a transformation matrix for a predicted tooth setup, a labelling of mesh elements for mesh cleanup, an addition of mesh elements to a mesh for the purpose of mesh in-filling, a classification label for a setup, a classification label for a tooth mesh, etc.).
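A minimal sketch of the metric-based loss term of paragraph [0085] follows; compute_metric is a hypothetical stand-in for any orthodontic metric that can be computed with tensor operations (so that gradients can flow during backpropagation).

import torch

def metric_loss(predicted_setup, ground_truth_setup, compute_metric):
    # Compute the same orthodontic metric on both examples ...
    om_predicted = compute_metric(predicted_setup)
    om_ground_truth = compute_metric(ground_truth_setup)
    # ... and consume both results in the loss, penalizing deviation.
    return torch.abs(om_predicted - om_ground_truth)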
One or more orthodontic metrics may be selected to become an input to the next round of ML model training, to address any pattern of errors or deficiencies which may be identified in the one or more predicted outputs.

[0090] Some OM may be defined relative to an archform coordinate frame, the LDE coordinate system. In some implementations, a point may be described using an LDE coordinate frame relative to an archform, where L, D and E correspond to: 1) Length along the curve of the archform, 2) Distance away from the archform, and 3) Distance in the direction perpendicular to the L and D axes (which may be termed Eminence), respectively.

[0091] Various of the OM and other techniques of the present disclosure may compute collisions between 3D representations (e.g., of oral care objects, such as teeth). Such collisions may be computed as at least one of: 1) penetration distance between 3D tooth representations, 2) count of overlapping mesh elements between 3D tooth representations, and 3) volume of overlap between 3D tooth representations. In some implementations, an OM may be defined to quantify the collision of two or more 3D representations of oral care structures, such as teeth. Some optimization algorithms, such as setups prediction techniques, may seek to minimize collisions between oral care structures (such as teeth). Between-arch orthodontic metrics are as follows.

[0092] Six (6) metrics for the comparison of two or more arches are listed below. Other suitable comparison orthodontic metrics are found elsewhere in this disclosure, such as in the section for the Setups Comparison technique.
1. Rotation geodesic distance (rotation between predicted example and ground truth setup example)
2. Translation distance (gap between predicted example and ground truth setup example)
3. Normalized translation distance
4. 3D alignment error that measures the distance between predicted mesh elements and ground truth mesh elements, in units of mm
5. Normalized 3D alignment
6. Percent overlap (% overlap) by volume (alternatively % overlap by mesh elements) of predicted example and corresponding ground truth example

[0093] Within-arch orthodontic metrics are as follows.
Alignment - A 3D tooth orientation vector may be calculated using the tooth's mesial-distal axis. A 3D vector, which may be a tangent vector to the archform at the position of the tooth, may also be calculated. The XY components (i.e., which may be 2D vectors) may then be used to compare the orientation of the archform at the tooth's location to the tooth's orientation in XY space. Cosine similarity may be used to calculate the 2D orientation difference (angle) between the archform tangent and the tooth's mesial-distal axis.
Arch Symmetry - For each left-right pair of teeth (e.g., lower left lateral incisor and/or lower right lateral incisor), the absolute difference may be calculated between each tooth's X-coordinate and the global coordinate reference frame's X-axis. This delta may indicate the arch asymmetry for a given tooth pair. The result of such a calculation may be the mean X-axis delta of one or more tooth-pairs from the arch. This calculation may, in some implementations, be performed relative to the Y-axis with Y-coordinates (and/or relative to the Z-axis with Z-coordinates).
Archform D-axis Differences – May compute the D dimension difference (i.e., the positional difference in the facial-lingual direction) between two arch states, for one or more teeth.
May, in some implementations, return a dictionary of the D-direction tooth movement for each tooth, with tooth UNS number as the key. May use the LDE coordinate system relative to an archform.
Archform (Lower) Length Ratio – May compute the ratio between the current lower arch length and the arch length as it was in the original maloccluded lower arch.
Archform (Upper) Length Ratio – May compute the ratio between the current upper arch length and the arch length as it was in the original maloccluded upper arch.
Archform Parallelism (Full arch) - For at least one local tooth coordinate system origin in the upper arch, find the one or more nearest origins (e.g., tooth local coordinate system origins) in the lower arch. In some implementations, the two nearest origins may be used. May compute the straight line distance from the upper arch point to the line formed between the origins of the two teeth in the opposing (lower) arch. May return the standard deviation of the set of "point-to-line" distances mentioned above, where the set may be composed of the point-to-line distances for each tooth in the arch.
Archform Parallelism (Individual tooth) – This metric may share some computational elements with the archform_parallelism_global orthodontic metric, except that this metric may compute the mean distance from a tooth origin to the line formed by the neighboring teeth in opposing arches (e.g., a tooth in the upper arch and the corresponding tooth in the lower arch). The mean distance may be computed for one or more such pairs of teeth. In some implementations, this may be computed for all pairs of teeth. Then the mean distance may be subtracted from the distance that is computed for each tooth pair. This OM may yield the deviation of a tooth from a "typical" tooth parallelism in the arch.
Buccolingual Inclination - For at least one molar or premolar, find the corresponding tooth on the opposite side of the same arch (i.e., for a tooth on the left side of the arch, find the same type of tooth on the right side and vice versa). This OM may compute an n-element list for each tooth (e.g., n may equal 2). This list may contain at least the tooth IDs of the teeth in each pair of teeth (e.g., LeftLowerFirstMolar and RightLowerFirstMolar in a list = [left_tooth_idx_1, right_tooth_idx_2]). Such an n-element vector may be computed for each molar and each premolar in the upper and lower arches. The buccal cusps may be identified on the molars and premolars on each of the left and right sides of the arch. Draw a line between the buccal cusps of the left tooth and the buccal cusps of the right tooth. Make a plane using this line and the z-axis of the arch. The lingual cusps may be projected onto the plane (i.e., at this point the angle of inclination may be determined). By performing an additional projection, the approximate vertical distance between the lingual cusps and the buccal cusps may be computed. This distance may be used as the buccolingual inclination OM.
Canine Overbite - The upper and lower canines may be identified. The first premolar for the given side of the mouth may be identified. On a given side of the arch, a distance may be computed between the upper canine and the lower canine, and also between the upper premolar and the lower premolar. The average (or median, or mode or some other statistic) may be computed for the measured distances. The z-component of this result indicates the degree of overbite.
Overbite may be computed between any tooth in one arch and the corresponding tooth in the other arch.
Canine Overjet Contact – May calculate the collisions (e.g., collision distances) between pairs of canines on opposing arches.
Canine Overjet Contact KDE – May take an orthodontic metric score for the current patient case as input, and may convert that score into a log-likelihood using a previously trained kernel density estimation (KDE) model or distribution. This operation may yield information about where in the distribution of "typical" values this patient case lies.
Canine Overjet – This OM may share some computational steps with the canine overbite OM. In some implementations, average distances may be computed. In some implementations, the distance calculation may compute the Euclidean distance of the XY components of a tooth in the upper arch and a tooth in the lower arch, to yield overjet (i.e., as opposed to computing the difference in Z-components, as may be performed for canine overbite). Overjet may be computed between any tooth in one arch and the corresponding tooth in the other arch.
Canine Class Relationship (also applies to first, second and third molars) – This OM may, in some implementations, comprise two functions (e.g., written in Python).
get_canine_landmarks(): Get landmarks for each tooth which may be used to compute the class relationship, and then, in some implementations, map those landmarks onto the global coordinate space so that measurements may be made between teeth.
class_relationship_score_by_side(): May compute the average position of at least one landmark on at least one tooth in the lower arch, and may compute the same for the upper arch. May then compute the vector from the upper arch landmark position to the lower arch landmark position, and finally may project this vector onto the lower arch to yield a quantification (e.g., as a scalar) of the positional delta along the arch's l-axis. This OM may compute how far forward or behind the tooth is positioned on the l-axis relative to the tooth or teeth of interest in the opposing arch.
Crossbite - The fossa in at least one upper molar may be located by finding the halfway point between the distal and mesial marginal ridge saddles of the tooth. A lower molar cusp may lie between the marginal ridges of the corresponding upper molar. This OM may compute a vector from the upper molar fossa midpoint to the lower molar cusp. This vector may be projected onto the d-axis of the archform, yielding a lateral measure of distance from the cusp to the fossa. This distance may define the crossbite magnitude.
Edge Alignment – This OM may identify the leftmost and rightmost edges of a tooth, and may identify the same for that tooth's neighbor. The OM may then draw a vector from the leftmost edge of the tooth to the leftmost edge of the tooth's neighbor. The OM may then draw a vector from the rightmost edge of the tooth to the rightmost edge of the tooth's neighbor. The OM may then calculate the linear fit error between the two vectors. Such a calculation may involve making two vectors:
Vec_tooth = right_tooths_leftside to left_tooths_leftside
Vec_neighbor = right_tooths_rightside to left_tooths_leftside
And then may involve computing the dot-product of these two vectors and subtracting the result from 1 (i.e., EdgeAlignment score = 1 - abs(dot(Vec_tooth, Vec_neighbor))). A score of 0 may indicate perfect alignment. A score of 1 may indicate perpendicular alignment.
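The Edge Alignment score above reduces to a short computation; a Python sketch follows, with the two edge vectors given as numpy arrays (the normalization step is an assumption, added so that the dot product reflects orientation only).

import numpy as np

def edge_alignment(vec_tooth, vec_neighbor):
    vec_tooth = vec_tooth / np.linalg.norm(vec_tooth)
    vec_neighbor = vec_neighbor / np.linalg.norm(vec_neighbor)
    # 0 may indicate perfect alignment; 1 may indicate perpendicular alignment.
    return 1.0 - abs(np.dot(vec_tooth, vec_neighbor))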
Incisor Interarch Contact KDE – May identify the deviation of the IncisorInterarchContact from the mean of a modeled distribution of such statistics across a dataset of one or more other patient cases. Leveling – May compute a measure of leveling between a tooth and its neighbor. This OM may calculate the difference in height between two or more neighboring teeth. For molars, this OM may use the midpoint between the mesial and distal saddle ridges as the height of the molar. For non-molar teeth, this OM may use the length of the crown from gums to tip. In some implementations, the tip may be the origin of the local coordinate space of the tooth. Other implementations may place the origin in other locations. A simple subtraction between the heights of neighboring teeth may yield the leveling delta between the teeth (e.g., by comparing Z components). Midline – May compute the position of the midline for the upper incisors and/or the lower incisors, and then may compute the distance between them. Molar Interarch Contact KDE – May compute a molar interarch contact score (i.e., a collision depth or other type of collision), and then may identify where that score lies in a pre-defined KDE (distribution) built from representative cases. Occlusal Contacts – For a particular tooth from the arch, this OM may identify one or more landmarks (e.g., mesial cusp, or central cusp, etc.). The tooth transform for that tooth may be obtained. For each cusp on the current tooth, the cusp may be scored according to how well the cusp contacts the neighboring (corresponding) tooth in the opposite arch. A vector may be found from the cusp of the tooth in question to the vertical intersection point in the corresponding tooth of the opposing arch. The distance and/or direction (i.e., up or down) to the opposing arch may be computed. A list may be returned that contains the resulting signed distances, one for each cusp on the tooth in question. Overbite – The upper and lower central incisors may be compared along the z-axis. The difference along the z-axis may be used as the overbite score. Overjet – The upper and lower central incisors may be compared along the y-axis. The difference along the y-axis may be used as the overjet score. Molar Interarch Contact – May calculate the contact score between molars, and may use collision measurement(s) (such as collision depth). Root Movement d – The tooth transforms for an initial state and a next state may be received. The archform axes at a point L along the archform may be computed. This OM may return a distance moved along the d-axis. This may be accomplished by projecting the root pivot point onto the d-axis. Root Movement l – The tooth transforms for an initial state and a next state may be received. The archform axes at a point L along the archform may be computed. This OM may return a distance moved along the l-axis. This may be accomplished by projecting the root pivot point onto the l-axis. Spacing – May compute the spacing between each tooth and its neighbor. The transforms and meshes for the arch may be received. The left and right edges of each tooth mesh may be computed. One or more points of interest may be transformed from local coordinates into the global arch coordinate frame. The spacing may be computed in a plane (e.g., the XY plane) between each tooth and its neighbor to the “left”. May return an array of one or more Euclidean distances (e.g., such as in the XY plane) which may represent the spacing between each tooth and its neighbor to the left. 
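As a non-limiting illustration, the spacing computation described above might be sketched as follows (the function name, the list-based tooth ordering, and the convention that index i + 1 holds the neighbor to the “left” are assumptions made for illustration):

```python
import numpy as np

def spacing_scores(right_edge_points, left_edge_points):
    """Spacing between each tooth and its neighbor, measured in the XY plane.

    right_edge_points[i] and left_edge_points[i] are the rightmost/leftmost
    edge points of tooth i, already transformed into the global arch frame.
    """
    spacings = []
    for i in range(len(right_edge_points) - 1):
        a = np.asarray(right_edge_points[i], dtype=float)[:2]    # XY components only
        b = np.asarray(left_edge_points[i + 1], dtype=float)[:2]
        spacings.append(float(np.linalg.norm(a - b)))
    return spacings
```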
Torque – May compute torque (i.e., rotation around an axis, such as the x-axis). For one or more teeth, one or more rotations may be converted from Euler angles into one or more rotation matrices. A component (such as an x-component) of the rotations may be extracted and converted back into Euler angles. This x-component may be interpreted as the torque for a tooth. A list may be returned which contains the torque for one or more teeth, and may be indexed by the UNS number of the tooth. [0094] In some implementations, a curve-of-spee metric may measure the curvature of the occlusal or incisal surfaces of the teeth on either the left or right sides of the arch, with respect to the occlusal plane. The occlusal plane may, in some instances, be computed as a surface which averages the incisal or occlusal surfaces of the teeth (for one or both arches). In some implementations, a curvature metric may be computed along a normal vector, such as a vector which is normal to the occlusal plane. In other implementations, a curvature metric may be computed along the normal vector of another plane. In some implementations, an XY plane may be defined to correspond to the occlusal plane. An orthogonal plane may be defined as the plane that is orthogonal to the occlusal plane, which also passes through a curve-of-spee line segment, where the curve-of-spee line segment is defined by a first endpoint which is a landmarking point on a first tooth (e.g., a canine) and a second endpoint which is a landmarking point on the most-posterior tooth of the same side of the arch. A landmarking point can in some implementations be located along the incisal edge of a tooth or on the cusp of a tooth. In some instances, the landmarking points for the intermediate teeth (e.g., teeth which are located between the first tooth and the most-posterior tooth) on either the left or right sides of the arch may form a curved path, such as may be described by a polyline. The following is a non-limiting list of curve-of-spee oral care metrics. [0095] 1) Measure the vertical height between a line segment and a point. Stated another way, measure a distance between a line segment and a point along the z-axis. The line segment is defined by joining the highest cusp of the most-posterior tooth (in the lower arch) and the cusp of the first tooth on that side (in the lower arch). Given the subset of teeth between the first tooth and the most-posterior tooth, the point is defined by the highest cusp of the lowest tooth of this subset. Stated another way, a curve-of-spee metric may be computed using the following four steps. i) Line: Form a line between the highest cusp on the most-posterior tooth and the cusp of the first tooth. ii) Curve_Point_A: Given the set of teeth between the most-posterior tooth and the first tooth, find the highest point of the lowest tooth. iii) Curve_Point_B: Project Curve_Point_A onto the Line to find a point (Curve_Point_B) along the line that is closest to Curve_Point_A. iv) Curve-Of-Spee: Find the height difference between Curve_Point_B and Curve_Point_A. [0096] 2) Project one or more intermediate landmark points (e.g., points on the teeth which lie between the first tooth and the most-posterior tooth on that side of the arch) and the curve-of-spee line segment onto the orthogonal plane. Compute the curve-of-spee metric by measuring the distance between the farthest of the projected intermediate points to the projected curve-of-spee line segment. 
This yields a measure for the curvature of the arch relative to the orthogonal plane. [0097] 3) Project one or more intermediate landmark points and the curve-of-spee line segment onto the occlusal plane. Compute the curve-of-spee metric in this plane by measuring the distance between the farthest of the projected intermediate points to the projected curve-of-spee line segment. This yields a measure for the curvature of the arch relative to the occlusal plane. [0098] 4) Skip the projection and compute the distances and curvatures in 3D space. Compute the curve-of-spee metric by measuring the distance between the farthest of the intermediate points to the curve-of-spee line segment. This yields a measure for the curvature of the arch in 3D space. [0099] 5) Compute the slope of the projected curve-of-spee line segment on the occlusal plane. [00100] 6) Compute the slope of the projected curve-of-spee line segment in the orthogonal plane. [00101] Curve-of-spee metrics 5 and 6 may help the network to reduce additional degrees of freedom in defining how the patient’s arch is curved in the posterior of the mouth. [00102] The neural networks of this disclosure may exploit one or more benefits of the operation of parameter tuning, whereby the inputs and parameters of a neural network are optimized to produce more data-precise results. One parameter which may be tuned is the neural network learning rate (e.g., which may have values such as 0.1, 0.01, 0.001, etc.). Data augmentation schemes may also be tuned or optimized, such as schemes where “shiver” is added to the tooth meshes before being input to the neural network (i.e., small random rotations, translations and/or scaling may be applied to vary the dataset and make the neural network robust to variations in data). A subset of the neural network model parameters available for tuning is as follows: 
o Learning rate (LR) decay rate (e.g., how much the LR decays during a training run) 
o Learning rate (LR): the floating-point value (e.g., 0.001) that is used by the optimizer 
o LR schedule (e.g., cosine annealing, step, exponential) 
o Voxel size (for cases with sparse mesh processing operations) 
o Dropout % (e.g., dropout which may be performed in a linear encoder) 
o LR decay step size (e.g., decay every 10, 20 or 30 epochs) 
o Model scaling, which may increase or decrease the count of layers and/or the count of parameters per layer. 
[00103] Parameter tuning may be advantageously applied to the training of a neural network for the prediction of final setups or intermediate staging to provide data precision-oriented technical improvements. Parameter tuning may also be advantageously applied to the training of a neural network for mesh element labeling or a neural network for mesh in-filling. In some examples, parameter tuning may be advantageously applied to the training of a neural network for tooth reconstruction. In terms of classifier models of this disclosure, parameter tuning may be advantageously applied to a neural network for the classification of one or more setups (i.e., classification of one or more arrangements of teeth). The advantage of parameter tuning is to improve the data precision of the output of a predictive model or a classification model. Parameter tuning may, in some instances, provide the advantage of obtaining the last remaining few percentage points of validation accuracy out of a predictive or classification model. [00104] Various neural network models of this disclosure may draw benefits from data augmentation. 
Examples include models of this disclosure which are trained on 3D meshes, such as GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups, Diffusion Setups, PT Setups, Similarity Setups, FDG Setups, Setups Classification, Setups Comparison, VAE Mesh Element Labeling, MAE Mesh In-filling, Mesh Reconstruction VAE, and Validation Using Autoencoders. Data augmentation, such as by way of the method shown in FIG.1, may increase the size of the training dataset of dental arches. Data augmentation can provide additional training examples by adding random rotations, translations, and/or rescaling to copies of existing dental arches. In some implementations of the techniques of this disclosure, data augmentation may be carried out by perturbing or jittering the vertices of the mesh, in a manner similar to that described in “Equidistant and Uniform Data Augmentation for 3D Objects,” IEEE Access, Digital Object Identifier 10.1109/ACCESS.2021.3138162. The position of a vertex may be perturbed through the addition of Gaussian noise, for example, with zero mean and 0.1 standard deviation. Other mean and standard deviation values are possible in accordance with the techniques of this disclosure. [00105] FIG.1 shows a data augmentation method that systems of this disclosure may apply to 3D oral care representations. A non-limiting example of a 3D oral care representation is a tooth mesh or a set of tooth meshes. Tooth data 100 (e.g., 3D meshes) are received at the input. The systems of this disclosure may generate copies of the tooth data 100 (102). In the example of FIG.1, the systems of this disclosure may apply one or more stochastic rotations to the tooth data 100 (104). In the example of FIG.1, the systems of this disclosure may apply stochastic translations to the tooth data 100 (106). The systems of this disclosure may apply stochastic scaling operations to the tooth data 100 (108). The systems of this disclosure may apply stochastic perturbations to one or more mesh elements of the tooth data 100 (110). The systems of this disclosure may output augmented tooth data 112 that are formed by way of the method of FIG.1 (a sketch of such an augmentation pipeline appears after the discussion of activation functions and optimization algorithms below). [00106] Because generator networks of this disclosure can be implemented as one or more neural networks, the generator may contain an activation function. When executed, an activation function outputs a determination of whether or not a neuron in a neural network will fire (e.g., send output to the next layer). Some activation functions may include: binary step functions, or linear activation functions. Other activation functions impart non-linear behavior to the network, including: sigmoid/logistic activation functions, Tanh (hyperbolic tangent) functions, rectified linear units (ReLU), leaky ReLU functions, parametric ReLU functions, exponential linear units (ELU), softmax function, swish function, Gaussian error linear unit (GELU), or scaled exponential linear unit (SELU). A linear activation function may be well suited to some regression applications (among other applications), in an output layer. A sigmoid/logistic activation function may be well suited to some binary classification applications (among other applications), in an output layer. A softmax activation function may be well suited to some multiclass classification applications (among other applications), in an output layer. A sigmoid activation function may be well suited to some multilabel classification applications (among other applications), in an output layer. 
A ReLU activation function may be well suited in some convolutional neural network (CNN) applications (among other applications), in a hidden layer. A Tanh and/or sigmoid activation function may be well suited in some recurrent neural network (RNN) applications (among other applications), for example, in a hidden layer. There are multiple optimization algorithms which can be used in the training of the neural networks of this disclosure (such as in updating the neural network weights), including gradient descent (which determines a training gradient using first-order derivatives and is commonly used in the training of neural networks), Newton's method (which may make use of second derivatives in loss calculation to find better training directions than gradient descent, but may require calculations involving Hessian matrices), and conjugate gradient methods (which may yield faster convergence than gradient descent, but do not require the Hessian matrix calculations which may be required by Newton's method). In some implementations, additional methods may be employed to update weights, in addition to or in place of the techniques described above. These additional methods include the Levenberg-Marquardt method and/or simulated annealing. The backpropagation algorithm is used to transfer the results of loss calculation back into the network so that network weights can be adjusted, and learning can progress. [00107] Neural networks contribute to the functioning of many of the applications of the present disclosure, including but not limited to: GDL Setups, RL Setups, VAE Setups, Capsule Setups, MLP Setups, Diffusion Setups, PT Setups, Similarity Setups, Tooth Classification, Setups Classification, Setups Comparison, VAE Mesh Element Labeling, MAE Mesh In-filling, Mesh Reconstruction Autoencoder, Validation Using Autoencoders, imputation of oral care parameters, 3D mesh segmentation (3D representation segmentation), Coordinate System Prediction, Mesh Cleanup, Restoration Design Generation, Appliance Component Generation and/or Placement, or Archform Prediction. The neural networks of the present disclosure may embody part or all of a variety of different neural network models. Examples include the U-Net architecture, multi-layer perceptron (MLP), transformer, pyramid architecture, recurrent neural network (RNN), autoencoder, variational autoencoder, regularized autoencoder, conditional autoencoder, capsule network, capsule autoencoder, stacked capsule autoencoder, denoising autoencoder, sparse autoencoder, long/short term memory (LSTM), gated recurrent unit (GRU), deep belief network (DBN), deep convolutional network (DCN), deep convolutional inverse graphics network (DCIGN), liquid state machine (LSM), extreme learning machine (ELM), echo state network (ESN), deep residual network (DRN), Kohonen network (KN), neural Turing machine (NTM), or generative adversarial network (GAN). In some implementations, an encoder structure or a decoder structure may be used. Each of these models provides one or more of its own particular advantages. For example, a particular neural network architecture may be especially well suited to a particular ML technique. For example, autoencoders are particularly suited to the classification of 3D oral care representations, due to the ability to encode the 3D oral care representation into a form which is more easily classifiable. 
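Returning to the data augmentation method of FIG.1, the following is a minimal sketch of how the stochastic rotation, translation, scaling, and mesh element perturbation steps might be composed in Python (the function name, the sampling ranges, and the use of SciPy for rotation handling are illustrative assumptions; the Gaussian noise with zero mean and 0.1 standard deviation follows the example given above):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def augment_tooth_mesh(vertices, rng=None):
    """Apply one round of FIG.1-style stochastic augmentations to a copy
    of an (N, 3) vertex array and return the augmented copy."""
    if rng is None:
        rng = np.random.default_rng()
    v = np.array(vertices, dtype=float)            # copy of the tooth data (102)
    # Stochastic rotation (104): small random "shiver" rotation.
    angles = rng.uniform(-5.0, 5.0, size=3)        # degrees per axis, illustrative
    v = v @ Rotation.from_euler("xyz", angles, degrees=True).as_matrix().T
    # Stochastic translation (106) and scaling (108).
    v = v + rng.uniform(-0.5, 0.5, size=3)         # illustrative translation range
    v = v * rng.uniform(0.95, 1.05)                # illustrative scaling range
    # Stochastic perturbation of mesh elements (110): zero-mean Gaussian
    # noise with 0.1 standard deviation applied to vertex positions.
    v = v + rng.normal(0.0, 0.1, size=v.shape)
    return v
```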
[00108] In some implementations, the neural networks of this disclosure can be adapted to operate on 3D point cloud data (or, alternatively, on 3D meshes or 3D voxelized representations). Numerous neural network implementations may be applied to the processing of 3D representations and to the training of predictive and/or generative models for oral care applications, including: PointNet, PointNet++, SO-Net, spherical convolutions, Monte Carlo convolutions and dynamic graph networks, PointCNN, ResNet, MeshNet, DGCNN, VoxNet, 3D-ShapeNets, Kd-Net, Point GCN, Grid-GCN, KCNet, PD-Flow, PU-Flow, MeshCNN and DSG-Net. Oral care applications include, but are not limited to: setups prediction (e.g., using VAE, RL, MLP, GDL, Capsule, Diffusion, etc., which have been trained for setups prediction), 3D representation segmentation, 3D representation coordinate system prediction, element labeling for 3D representation clean-up (VAE for Mesh Element Labeling), in-filling of missing elements in a 3D representation (MAE for Mesh In-Filling), dental restoration design generation, setups classification, appliance component generation and/or placement, archform prediction, imputation of oral care parameters, setups validation or other validation applications, and tooth 3D representation classification. [00109] Some implementations of the techniques of this disclosure incorporate the use of an autoencoder. Autoencoders that can be used in accordance with aspects of this disclosure include but are not limited to: AtlasNet, FoldingNet and 3D-PointCapsNet. Some autoencoders may be implemented based on PointNet. [00110] Representation learning may be applied to setups prediction techniques of this disclosure by training a neural network to learn a representation of the teeth, and then using another neural network to generate transforms for the teeth. Some implementations may use a VAE or a Capsule Autoencoder to generate a representation of the reconstruction characteristics of the one or more meshes related to the oral care domain (including, in some instances, information about the structures of the tooth meshes). Then that representation (either a latent vector or a latent capsule) may be used as input to a module which generates the one or more transforms for the one or more teeth. These transforms may in some implementations place the teeth into final setups poses. These transforms may in some implementations place the teeth into intermediate staging poses. In some implementations, a transform may be described by a 9x1 transformation vector (e.g., that specifies a translation vector and a quaternion). In other implementations, a transform may be described by a transformation matrix (e.g., a 4x4 affine transformation matrix). [00111] In some implementations, systems of this disclosure may implement a principal components analysis (PCA) on an oral care mesh, and use the resulting principal components as at least a portion of the representation of the oral care mesh in subsequent machine learning and/or other predictive or generative processing. [00112] An autoencoder may be trained to generate a latent form of a 3D oral care representation. An autoencoder may contain a 3D encoder (which encodes a 3D oral care representation into a latent form), and/or a 3D decoder (which reconstructs that latent form into a facsimile of the inputted 3D oral care representation). 
Although this disclosure refers to 3D encoders and 3D decoders, the term 3D should be interpreted in a non-limiting fashion to encompass multi-dimensional modes of operation. For example, systems of this disclosure may train multi-dimensional encoders and/or multi-dimensional decoders. [00113] Systems of this disclosure may implement end-to-end training. Some of the end-to-end training-based techniques of this disclosure may involve two or more neural networks, where the two or more neural networks are trained together (i.e., the weights are updated concurrently during the processing of each batch of input oral care data). End-to-end training may, in some implementations, be applied to setups prediction by concurrently training a neural network which learns a representation of the teeth, along with a neural network which generates the tooth transforms. [00114] According to some of the transfer learning-based implementations of this disclosure, a neural network (e.g., a U-Net) may be trained on a first task (e.g., such as coordinate system prediction). The neural network trained on the first task may be executed to provide one or more of the starting neural network weights for the training of another neural network that is trained to perform a second task (e.g., setups prediction). The first network may learn the low-level neural network features of oral care meshes and may be shown to work well at the first task. The second network may exhibit faster training and/or improved performance by using the first network as a starting point in training. Certain layers may be trained to encode neural network features for the oral care meshes that were in the training dataset. These layers may thereafter be fixed (or be subjected to minor changes over the course of training) and be combined with other neural network components, such as additional layers, which are trained for one or more oral care tasks (such as setups prediction). In this manner, a portion of a neural network for one or more of the techniques of the present disclosure (e.g., setups prediction) may receive initial training on another task, which may yield important learning in the trained network layers. This encoded learning may then be built upon with further task-specific training of another network. [00115] In accordance with this disclosure, transfer learning may be used for setups prediction, as well as for other oral care applications, such as mesh classification (e.g., tooth or setups classification), mesh element labeling, mesh element in-filling, procedure parameter imputation, mesh segmentation, coordinate system prediction, restoration design generation, or mesh validation (for any of the applications disclosed herein). In some implementations, a neural network trained to output predictions based on oral care meshes may first be partially trained on one of the following publicly available datasets, before being further trained on oral care data: Google PartNet dataset, ShapeNet dataset, ShapeNetCore dataset, Princeton Shape Benchmark dataset, ModelNet dataset, ObjectNet3D dataset, Thingi10K dataset (which is especially relevant to 3D printed parts validation), ABC: A Big CAD Model Dataset For Geometric Deep Learning, ScanObjectNN, VOCASET, 3D-FUTURE, MCB: Mechanical Components Benchmark, PoseNet dataset, PointCNN dataset, MeshNet dataset, MeshCNN dataset, PointNet++ dataset, or PointNet dataset. 
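The following is a minimal sketch of this weight-transfer pattern, written in a PyTorch style under stated assumptions: the SmallEncoder and SetupsPredictionNet classes, the layer sizes, and the 7-element transform output are hypothetical, and the first-task network is assumed to have already been trained (e.g., on coordinate system prediction):

```python
import torch
import torch.nn as nn

class SmallEncoder(nn.Module):
    """Hypothetical encoder whose layers learn low-level mesh features."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(),
                                    nn.Linear(256, 128), nn.ReLU())
    def forward(self, x):
        return self.layers(x)

class SetupsPredictionNet(nn.Module):
    """Hypothetical second-task model: shared encoder plus a new head."""
    def __init__(self):
        super().__init__()
        self.encoder = SmallEncoder()    # receives the transferred weights
        self.head = nn.Linear(128, 7)    # e.g., translation + quaternion
    def forward(self, x):
        return self.head(self.encoder(x))

first_task_encoder = SmallEncoder()      # stands in for the trained first network

model = SetupsPredictionNet()
model.encoder.load_state_dict(first_task_encoder.state_dict())  # transfer weights

# Fix the transferred layers (or allow only minor changes) while the
# task-specific head is trained on the second task (setups prediction).
for p in model.encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```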
[00116] In some implementations, a neural network which was previously trained on a first dataset (either oral care data or other data) may subsequently receive further training on oral care data and be applied to oral care applications (such as setups prediction). Transfer learning may be employed to further train any of the following networks: GCN (Graph Convolutional Networks), PointNet, ResNet or any of the other neural networks from the published literature which are listed above. [00117] In some implementations, a first neural network may be trained to predict coordinate systems for teeth (such as by using the techniques described in WO2022123402A1 or US Provisional Application No. US63/366492). A second neural network may be trained for setups prediction, according to any of the setups prediction techniques of the present disclosure (or a combination of any two or more of the techniques described herein). Transfer learning may transfer at least a portion of the knowledge or capability of the first neural network to the second neural network. As such, transfer learning may provide the second neural network with an accelerated training phase to reach convergence. In some implementations, the training of the second network may, after being augmented with the transferred learning, then be completed using one or more of the techniques of this disclosure. [00118] Systems of this disclosure may train ML models with representation learning. The advantages of representation learning include the fact that the generative network (e.g., a neural network that predicts a transform for use in setups prediction) can be configured to receive input with a known size and/or standard format, as opposed to receiving input with a variable size or structure. Representation learning may produce improved performance over other techniques, because noise in the input data may be reduced (e.g., because the representation generation model extracts hierarchical neural network features and/or reconstruction characteristics of an inputted representation (e.g., a mesh or point cloud) through loss calculations or network architectures chosen for that purpose). [00119] Reconstruction characteristics may comprise values of a latent representation (e.g., a latent vector) that describe aspects of the shape and/or structure of the 3D representation that was provided to the representation generation module that generated the latent representation. The weights of the encoder module of a reconstruction autoencoder, for example, may be trained to encode a 3D representation (e.g., a 3D mesh, or others described herein) into a latent representation (e.g., a latent vector). Stated another way, the capability to encode a large set (e.g., hundreds, thousands or millions) of mesh elements into a latent vector (e.g., of hundreds or a thousand real values – e.g., 512, 1024, etc.) may be learned by the weights of the encoder. Each dimension of that latent vector may contain a real number which describes some aspect of the shape and/or structure of the original 3D representation. The weights of the decoder module of the reconstruction autoencoder may be trained to reconstruct the latent vector into a close facsimile of the original 3D representation. Stated another way, the capability to interpret the dimensions of the latent vector, and to decode the values within those dimensions, may be learned by the decoder. 
In summary, the encoder and decoder neural network modules are trained to perform the mapping of a 3D representation into a latent vector, which may then be mapped back (or otherwise reconstructed) into a 3D representation that is substantially similar to the original 3D representation for which the latent vector was generated. [00120] Returning to loss calculation, examples of loss calculation may include KL-divergence loss, reconstruction loss or other losses disclosed herein. Representation learning may reduce the size of the dataset required for training a model, because the representation model learns the representation, enabling the generative network to focus on learning the generative task. The result may be improved model generalization, because meaningful neural network features of the input data (e.g., local and/or global features) are made available to the generative network. Stated another way, a first network may learn the representation, and a second network may make the predictive decision. By training two networks to perform their own separate tasks, each of the networks may generate more accurate results for their respective tasks than with a single network which is trained to both learn a representation and make a decision. In some instances, transfer learning may first train a representation generation model. That representation generation model (in whole or in part) may then be used to pre-train a subsequent model, such as a generative model (e.g., that generates transform predictions). A representation generation model may benefit from taking mesh element features as input, to improve the capability of a second ML module to encode the structure and/or shape of the inputted 3D oral care representations in the training dataset. [00121] One or more of the neural network models of this disclosure may have attention gates integrated within. Attention gate integration provides the enhancement of enabling the associated neural network architecture to focus resources on one or more input values. In some implementations, an attention gate may be integrated with a U-Net architecture, with the advantage of enabling the U-Net to focus on certain inputs, such as input flags which correspond to teeth which are meant to be fixed (e.g., prevented from moving) during orthodontic treatment (or which require other special handling). An attention gate may also be integrated with an encoder or with an autoencoder (such as a VAE or capsule autoencoder) to improve predictive accuracy, in accordance with aspects of this disclosure. For example, attention gates can be used to configure a machine learning model to give higher weight to aspects of the data which are more likely to be relevant to correctly generated outputs. As such, and because a machine learning model configured with these attention gates (or mechanisms) utilizes aspects of the data that are more likely to be relevant to correctly generated outputs, the ultimate predictive accuracy of those machine learning models is improved. [00122] The quality and makeup of the training dataset for a neural network can impact the performance of the neural network in its execution phase. 
Dataset filtering and outlier removal can be advantageously applied to the training of the neural networks for the various techniques of the present disclosure (e.g., for the prediction of final setups or intermediate staging, for mesh element labeling or mesh in-filling, for tooth reconstruction, for 3D mesh classification, etc.), because dataset filtering and outlier removal may remove noise from the dataset. And while the mechanism for realizing an improvement is different than using attention gates, the ultimate outcome is that this approach allows the machine learning model to focus on relevant aspects of the dataset, and may lead to improvements in accuracy similar to those realized vis-à-vis attention gates. [00123] In the case of a neural network configured to predict a final setup, a patient case may contain at least one of a set of segmented tooth meshes for that patient, a mal transform for each tooth, and/or a ground truth setup transform for each tooth. In the case of a neural network configured to predict a set of intermediate stage setups, a patient case may contain at least one of a set of segmented tooth meshes for that patient, a mal transform for each tooth, and/or a set of ground truth intermediate stage transforms for each tooth. In some implementations, a training dataset may exclude patient cases which contain passive stages (i.e., stages where the teeth of an arch do not move). In some implementations, the dataset may exclude cases where passive stages exist at the end of treatment. In some implementations, a dataset may exclude cases where overcrowding is present at the end of treatment (i.e., where the oral care provider, such as an orthodontist or dentist, has chosen a final setup where the tooth meshes overlap to some degree). In some implementations, the dataset may exclude cases of a certain level (or levels) of difficulty (e.g., easy, medium and hard). [00124] In some implementations, the dataset may include cases with zero pinned teeth (or may include cases where at least one tooth is pinned). A pinned tooth may be designated by a technician as they design the treatment, to stop the various tools from moving that particular tooth. In some implementations, a dataset may exclude cases without any fixed teeth (conversely, where at least one tooth is fixed). A fixed tooth may be defined as a tooth that shall not move in the course of treatment. In some implementations, a dataset may exclude cases without any pontic teeth (conversely, cases in which at least one tooth is pontic). A pontic tooth may be described as a “ghost” tooth that is represented in the digital model of the arch but is either not actually present in the patient’s dentition or where there may be a small or partial tooth that may benefit from future work (such as the addition of composite material through a dental restoration appliance). The advantage of including a pontic tooth in a patient case is to leave space in the arch as a part of a plan for the movements of other teeth, in the course of orthodontic treatment. In some instances, a pontic tooth may save space in the patient’s dentition for future dental or orthodontic work, such as the installation of an implant or crown, or the application of a dental restoration appliance, such as to add composite material to an existing tooth that is too small or has an undesired shape. 
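A minimal sketch of such dataset filtering appears below. The dictionary keys and the particular combination of exclusion criteria are hypothetical; in practice, the criteria would be selected per the considerations discussed above:

```python
def filter_training_cases(cases, require_fixed_tooth=False):
    """Drop patient cases that match illustrative exclusion criteria."""
    kept = []
    for case in cases:
        if case.get("has_passive_stages", False):
            continue    # exclude cases containing passive stages
        if case.get("overcrowded_final_setup", False):
            continue    # exclude overcrowding at the end of treatment
        if require_fixed_tooth and case.get("num_fixed_teeth", 0) == 0:
            continue    # optionally keep only cases with a fixed tooth
        kept.append(case)
    return kept
```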
[00125] In some implementations, the dataset may exclude cases where the patient does not meet an age requirement (e.g., younger than 12). In some implementations, the dataset may exclude cases with interproximal reduction (IPR) beyond a certain threshold amount (e.g., more than 1.0 mm). The dataset to train a neural network to predict setups for clear tray aligners (CTA) may exclude patient cases which are not related to CTA treatment. The dataset to train a neural network to predict setups for an indirect bonding tray product may exclude cases which are not related to indirect bonding tray treatment. In some implementations, the dataset may exclude cases where only certain teeth are treated. In such implementations, a dataset may comprise only cases where at least one of the following is treated: anterior teeth, posterior teeth, bicuspids, molars, incisors, and/or cuspids. [00126] The mesh comparison module may compare two or more meshes, for example for the computation of a loss function or for the computation of a reconstruction error. Some implementations may involve a comparison of the volume and/or area of the two meshes. Some implementations may involve the computation of a minimum distance between corresponding vertices/faces/edges/voxels of two meshes. For a point in one mesh (a vertex point, mid-point on an edge, or triangle center, for example), the minimum distance between that point and the corresponding point in the other mesh may be computed. In the case that the other mesh has a different number of elements, or there is otherwise no clear mapping between corresponding points for the two meshes, different approaches can be considered. For example, the open-source software packages CloudCompare and MeshLab each have mesh comparison tools which may play a role in the mesh comparison module for the present disclosure. In some implementations, a Hausdorff Distance may be computed to quantify the difference in shape between two meshes. The open-source software tool Metro, developed by the Visual Computing Lab, can also play a role in quantifying the difference between two meshes. The following paper describes the approach taken by Metro, which may be adapted by the neural network applications of the present disclosure for use in mesh comparison and difference quantification: P. Cignoni, C. Rocchini and R. Scopigno, “Metro: measuring error on simplified surfaces,” Computer Graphics Forum, Blackwell Publishers, vol. 17(2), June 1998, pp. 167-174. [00127] Some techniques of this disclosure may incorporate the operation of, for one or more points on the first mesh, projecting a ray normal to the mesh surface and calculating the distance before that ray is incident upon the second mesh. The lengths of the resulting line segments may be used to quantify the distance between the meshes. According to some techniques of this disclosure, the distance may be assigned a color based on the magnitude of that distance, and that color may be applied to the first mesh, by way of visualization. [00128] A 3D representation may be produced using a 3D scanner, such as an intraoral scanner, a computerized tomography (CT) scanner, an ultrasound scanner, a magnetic resonance imaging (MRI) machine or a mobile device which is enabled to perform stereophotogrammetry. A 3D representation may describe the shape and/or structure of a subject. A 3D representation may include one or more of a 3D mesh, a 3D point cloud, and/or a 3D voxelized representation, among others. A 3D mesh includes edges, vertices, or faces. 
Though interrelated in some instances, these three types of data are distinct. The vertices are the points in 3D space that define the boundaries of the mesh. These points would alternatively be described as a point cloud but for the additional information about how the points are connected to each other, as described by the edges. An edge is described by two points and can also be referred to as a line segment. A face is described by a number of edges and vertices. For instance, in the case of a triangle mesh, a face comprises three vertices, where the vertices are interconnected to form three contiguous edges. Some meshes may contain degenerate elements, such as non-manifold mesh elements, which may be removed, to the benefit of later processing. Other mesh pre-processing operations are possible in accordance with aspects of this disclosure. 3D meshes are commonly formed using triangles, but may in other implementations be formed using quadrilaterals, pentagons, or some other n-sided polygon. In some implementations, a 3D mesh may be converted to one or more voxelized geometries (i.e., comprising voxels), such as in the case that sparse processing is performed. The techniques of this disclosure which operate on 3D meshes may receive as input one or more tooth meshes (e.g., arranged in one or more dental arches). Each of these meshes may undergo pre-processing before being input to the predictive architecture (e.g., including at least one of an encoder, decoder, pyramid encoder-decoder and U-Net). This pre-processing may include the conversion of the mesh into lists of mesh elements, such as vertices, edges, faces or, in the case of sparse processing, voxels. For the chosen mesh element type or types (e.g., vertices), feature vectors may be generated. In some examples, one feature vector is generated per vertex of the mesh. Each feature vector may contain a combination of spatial and/or structural features, as specified in the following table: 
Edges – Spatial features: XYZ position of an edge midpoint, XYZ positions of the edge vertices, or the normal vector at an edge midpoint (average of the normal vectors of two vertices). Structural features: edge curvature (depends on a connectivity neighborhood, e.g., the average curvature of two vertices), dihedral angles, edge length, and a density measure such as a count of incident edges (i.e., a count of the other neighboring edges which share the vertices of that edge). 
Faces – Spatial features: XYZ position of a face centroid, surface normal vector. Structural features: face curvature (average curvature of the vertices of the face), face area, and a density measure such as a count of adjacent faces (i.e., faces which share at least one edge with the face). 
Points – Spatial features: XYZ position. Structural features: a density measure such as the count of neighboring points within a radius of the point. 
Vertices – Spatial features: XYZ position, normal vector (weighted average of the normal vectors of the connecting faces for the vertex). Structural features: vertex curvature, a density measure such as the count of vertices within a radius of the vertex, and a density measure such as the count of incident edges. 
Voxels – Spatial features: XYZ centroid. Structural features: volume, [height x depth x width] dimensions, a density measure such as a count of contained vertices, a density measure such as a count of intersected faces, and a density measure such as a count of intersected edges. 
Table 1 [00129] Table 1 discloses non-limiting examples of mesh element features. 
In some implementations, color (or other visual cues/identifiers) may be considered as a mesh element feature, in addition to the spatial or structural mesh element features described in Table 1. As used herein (e.g., in Table 1), a point differs from a vertex in that a point is part of a 3D point cloud, whereas a vertex is part of a 3D mesh and may have incident faces or edges. A dihedral angle (which may be expressed in either radians or degrees) may be computed as the angle (e.g., a signed angle) between two connected faces (e.g., two faces which are connected along an edge). A sign on a dihedral angle may reveal information about the convexity or concavity of a mesh surface. For example, a positively signed angle may, in some implementations, indicate a convex surface. Furthermore, a negatively signed angle may, in some implementations, indicate a concave surface. To calculate the principal curvature of a mesh vertex, directional curvatures may first be calculated to each adjacent vertex around the vertex. These directional curvatures may be sorted in circular order (e.g., 0, 49, 127, 210, 305 degrees) in proximity to the vertex normal vector and may comprise a subsampled version of the complete curvature tensor. Circular order means sorted by angle around an axis. The sorted directional curvatures may contribute to a linear system of equations amenable to a closed-form solution, which may estimate the two principal curvatures and directions, which may characterize the complete curvature tensor. Consistent with Table 1, a voxel may also have features which are computed as the aggregates of the other mesh elements (e.g., vertices, edges and faces) which either intersect the voxel or, in some implementations, are predominantly or fully contained within the voxel. Rotating the mesh may not change structural features but may change spatial features. And, as described elsewhere in this disclosure, the term “mesh” should be considered in a non-limiting sense to be inclusive of 3D mesh, 3D point cloud and 3D voxelized representation. In some implementations, apart from mesh element features, there are alternative methods of describing the geometry of a mesh, such as 3D keypoints and 3D descriptors. Examples of such 3D keypoints and 3D descriptors are found in Tonioni, A., et al., “Learning to detect good 3D keypoints,” Int. J. Comput. Vis., 2018, Vol. 126, pages 1-20. 3D keypoints and 3D descriptors may, in some implementations, describe extrema (either minima or maxima) of the surface of a 3D representation. In some implementations, one or more mesh element features may be computed, at least in part, via deep feature synthesis (DFS), e.g., as described in: J. M. Kanter and K. Veeramachaneni, “Deep feature synthesis: Towards automating data science endeavors,” 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2015, pp. 1-10, doi: 10.1109/DSAA.2015.7344858. [00130] Representation generation neural networks based on autoencoders, U-Nets, transformers, other types of encoder-decoder structures, convolution and pooling layers, or other models may benefit from the use of mesh element features. Mesh element features may convey aspects of a 3D representation’s surface shape and/or structure to the neural network models of this disclosure. Each mesh element feature describes distinct information about the 3D representation that may not be redundantly present in other input data that are provided to the neural network. 
For example, a vertex curvature may quantify aspects of the concavity or convexity of the surface of a 3D representation which would not otherwise be understood by the network. Stated differently, mesh element features may provide a processed version of the structure and/or shape of the 3D representation, data that would not otherwise be available to the neural network. This processed information is often more accessible, or more amenable for encoding, by the neural network. A system implementing the techniques disclosed herein has been utilized to run a number of experiments on 3D representations of teeth. For example, mesh element features have been provided to a representation generation neural network which is based on a U-Net model, and also to a representation generation model based on a variational autoencoder with continuous normalizing flows. Based on experiments, it was found that systems using a full complement of mesh element features (e.g., “XYZ” coordinates tuple, “Normal vector”, “Vertex Curvature”, Points-Pivoted, and Normals-Pivoted) were at least 3% more accurate than systems that did not use them. Points-Pivoted describes “XYZ” coordinates tuples that have local coordinate systems (e.g., at the centroid of the respective tooth). Normals-Pivoted describes “Normal Vectors” which have local coordinate systems (e.g., at the centroid of the respective tooth). Furthermore, training converges more quickly when the full complement of mesh element features is used. Stated another way, the machine learning models trained using the full complement of mesh element features tended to be more accurate more quickly (at earlier epochs) than systems which did not. For an existing system observed to have a historical accuracy rate of 91%, an improvement in accuracy of 3% reduces the actual error rate by more than 30%. 
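As a non-limiting illustration, per-vertex feature vectors of the kind listed in Table 1 might be assembled as follows (the function name and the particular choice of features, namely XYZ position, an averaged vertex normal, and an incident-face count as a simple density measure, are assumptions for illustration; curvature and other Table 1 features could be appended in the same way):

```python
import numpy as np

def vertex_feature_vectors(vertices, faces):
    """Return one feature vector per vertex: [x, y, z, nx, ny, nz, density]."""
    vertices = np.asarray(vertices, dtype=float)
    faces = np.asarray(faces, dtype=int)
    # Per-face normals via the cross product of two edge vectors.
    e1 = vertices[faces[:, 1]] - vertices[faces[:, 0]]
    e2 = vertices[faces[:, 2]] - vertices[faces[:, 0]]
    face_normals = np.cross(e1, e2)
    face_normals /= np.linalg.norm(face_normals, axis=1, keepdims=True) + 1e-12
    # Average the incident face normals onto each vertex, and count the
    # incident faces as a simple density measure.
    normals = np.zeros_like(vertices)
    counts = np.zeros(len(vertices))
    for face, normal in zip(faces, face_normals):
        normals[face] += normal
        counts[face] += 1
    normals /= np.linalg.norm(normals, axis=1, keepdims=True) + 1e-12
    return np.hstack([vertices, normals, counts[:, None]])
```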
Some implementations of optimization algorithms, such as the neural networks of the instant disclosure or the techniques described in WO2020026117A1, may compute orthodontic metrics through the course of operation, as a means of quantifying progress and guiding the optimization algorithm towards an end state (e.g., the prediction of a final setup). [00133] There may be dozens or even hundreds of metrics to visualize, each of which exists along its own coordinate axis. The present technique uses a dimensionality reduction technique, such as the tSNE or U-Map technique, for dimensionality reduction of orthodontic metrics data. The techniques of this disclosure may, in some examples, further implement visualization techniques that highlight the results of the dimensionality reduction. Some implementations may further implement the plotting of the progress of an orthodontic treatment plan, over many successive stages. Such a plot may, in some instances, be used by a clinician or technician to assess the appropriateness of a treatment plan. [00134] Metrics-based automation techniques are described herein for the creation of final setups and intermediate staging (e.g., as also discussed under the “Orthodontic Metrics Calculation” heading). These automation techniques may, in some implementations, quantify the relative positions and orientations of at least one tooth relative to at least one other tooth using metrics. Some metrics may, for example, quantify the degree of alignment (or misalignment) of the teeth in one or more arches of teeth. This technique pertains to the visualization of those metrics, in the form of high-dimensional vectors which may be reduced in dimensionality, to make the data more useful in the production of clear tray aligners or other oral care appliances. The technique may also apply to the visualization of other types of metrics in digital dentistry and digital orthodontics, such as for metrics which may be used in the automation of dental restoration appliance creation (i.e., for the 3M® Filtek™ Matrix), such as Restoration Design Metrics (RDM) (e.g., as described under the “Restoration Design Metrics Calculation” heading herein). [00135] Some implementations of the techniques of the present disclosure involve dimensionality reduction of these high-dimensional vectors of metrics, so that the vectors may be visualized in two dimensions (2D) or three dimensions (3D), for interpretation by a clinician. Techniques of this disclosure may also chart the progress of a 3M® Clarity™ aligners case, plotting points in this low-dimension space (i.e., 2D or 3D) which show the progress of successive stages of treatment. [00136] The 3M CLARITY aligners product may be used in orthodontic treatment and may involve a succession of clear plastic trays. Each tray may move the patient's teeth by an increment, until at the end of treatment, the patient's teeth may have been moved into the target positions and orientations. The 3M® Filtek™ Matrix product may be a custom 3D printed mold which may be used to shape dental composite, to create veneers. [00137] An orthodontic setup may describe the position and/or orientation transforms which are to be applied to each tooth of a set of teeth, to bring those teeth into the positions which are desired at the conclusion of orthodontic treatment (according to a final setup). US20210259808A1 and US20220249201A1 describe metrics which may be used to quantify an orthodontic setup. 
These physical characteristics may show how near a setup is to the ideal or aimed-for characteristics. These metrics may include, but are not limited to, how well aligned teeth are to their neighboring teeth or to the archform, as well as the degree of crossbite or overbite evident in the arch. Dozens of metrics may be computed in the course of an automated final setup or intermediate stage design process. [00138] Techniques of this disclosure enable these many metrics to be visualized in a form (e.g., in a 2D plot) that enables a clinician to make decisions in patient treatment which improve treatment outcomes in a 3M CLARITY aligners case or in a 3M® Filtek™ Matrix case. [00139] Numerous types of metrics may be computed for an arch or for a pair of arches. This set of scalar-valued metrics may be represented as a vector. The vector may describe the suitability of the dental setup for use in creating clear tray aligners. There is potential value in being able to plot this vector of scalar metrics values in an n-dimensional space (e.g., a 470-dimensional space in one implementation). The set of metrics values corresponding to a mal setup, which reflects the starting arrangement of the teeth, may be plotted in this space, and each successive intermediate setup may also be plotted in this space. Finally, the final setup (which reflects the target or end-point of treatment) may be plotted in this space. In one example, the vector of metrics values for a setup may comprise 470 individual scalar values. Plotting a point relative to 470 orthogonal coordinate axes may produce a result which is perfectly understandable to a computer, but a human may struggle to understand such a plot (because the plot would likely have to be displayed to the human in incremental steps, including 2 or 3 axes at a time). The techniques of this disclosure implement dimensionality reduction to make a plot of metrics data (e.g., sometimes on the order of 470 axes) amenable to better interpretation and analysis by a clinician. [00140] Dimensionality reduction techniques in accordance with this disclosure may include one or more of t-distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), CompressionVAE (CVAE), principal component analysis (PCA), multidimensional scaling (MDS), Sammon mapping, and/or graph-based techniques. t-SNE, UMAP and other techniques may preserve neighborhood embeddings, which may transform each data point from a high-dimensional dataset into a lower-dimensional space (e.g., 2 or 3 dimensions). t-SNE may preserve clusters of data points when this dimensionality reduction operation occurs, meaning that if two points are within the same cluster in the high-dimensional space, then those points may be in the same cluster in the low-dimensional space. [00141] Techniques of the present disclosure may use UMAP, t-SNE and/or other dimensionality reduction techniques to visualize the metrics values which are computed for an orthodontic setup. One advantageous aspect of these visualizations may be the emergence of visual clusters of datapoints representing similar arches, which appear as 1D, 2D or 3D plots, which are made after the n dimensions (e.g., n = 470) are reduced to 1, 2 or 3 dimensions, respectively. The technique described above may enable the representation of high-dimensional data as data of arbitrarily lower dimensionality. 
Selecting this representation dimensionality to be between one (1D) and three dimensions (3D) may enable visualization of complex data in a manner that is easily understood. [00142] An illustration of progress from Mal to Setup using a single orthodontic metric (e.g., Alignment, indicating how well aligned the teeth are with the arch) can be seen in FIG.2. FIG.2 is a graph that plots the decrease in alignment error over the course of orthodontic treatment for a patient case. [00143] The present technique’s use of UMAP may generate a single representation of progress by combining multiple (perhaps hundreds of) individual metrics into a single visualization. The direction of progress of a given arch toward Setup may be illustrated by the black parallel lines in FIG.3. As shown in FIG.3, after dimensionality reduction is applied, a 2D visualization of metrics for cases in a dataset can be formed. [00144] A clinician who is working to design the intermediate and final setups for a patient’s orthodontic treatment may examine such a plot (e.g., FIG.4) and see a progression of the current arch characteristics towards the region of the plot where setups normally reside, which illustrates the progress of treatment through the course of a succession of intermediate stages, leading to the final setup. The data precision and resource efficiency of the treatment planning process may benefit from this visualization technique, also potentially resulting in an improved patient outcome. [00145] In some implementations, a path (either a linear segment or a path along the surface of a manifold) may be drawn on this reduced-dimensionality plot (i.e., a 1D, 2D, or 3D plot), and each successive intermediate stage may be projected onto that line. The projected points which result from these projections show the progress of treatment towards the final setup. The position of each projected point on the line between Mal and Setup may be used to calculate a percentage score representing how complete a treatment is (i.e., showing how far from Mal and towards Setup the point is located). A clinician may use this plot to determine whether a series of intermediate stages is progressing from the mal tooth configuration towards the intended final setup tooth configuration. The techniques of this disclosure may provide data precision-related technical improvements in the form of an improved aligner production process, by enabling quick and intuitive validation of the intermediate staging progression. This approach may also provide a summary of the progress, towards an automated system which may be used for monitoring or interacting with the setup design across one or multiple cases. As shown in FIG.4, after dimensionality reduction is applied, a 2D visualization of metrics for cases in a dataset can be formed. [00146] The progress percentage for the patient case from FIG.4 is illustrated in FIG.5. In this example, the Mal is stage 0 on the x-axis, and the Setup is stage 28 on the x-axis. A clear trend upwards is visible in FIG.5. FIG.5 shows an example of the progress of an orthodontic case through the stages of treatment. [00147] In one example, FIG.5 shows various stages of treatment “progress” for an aligners case from Mal to Setup, on a stage-by-stage basis. 
The "progress" metric (plotted on the vertical or y-axis) is a low dimensional representation of the approximately 450-dimensional metrics vector, where each of the 450 elements in the vector are derived from an independent clinical metric relating to a tooth’s position and orientation in the arch. The 450 metrics may first be reduced to a lower dimensional space using t- sne or UMAP (or another dimensionality reduction technique). A line in this low dimensional space may be defined between the Mal metrics point and the setup metrics point. At each intermediate stage of the treatment a new metrics point may be calculated and projected onto the Mal-Setup line. The distance along this line represents the "progress" score plotted in FIG.5. [00148] Additionally, the number of dimensions in which to represent the data may also be assessed using this technique. By generating many different possible lower dimensional representations, the present technique may compare which representations provide sufficient information. For example, when comparing the results between 2, 3, and 200 dimensions, it may, in some instances, be seen that there is limited gain in extending the representation of the data beyond 3 dimensions as the 3D plot is quite different to the 2D plot, but very similar to the 200-dimension plot (see FIG.7). This example illustrates the ability of this technique of the present disclosure to compare different dimensional representations of data, while retaining the ability to visualize the results regardless of the specific dimensionality chosen. [00149] FIG.7 shows the progress plots for a particular case. The three lines plotted in FIG.7 illustrate how the progress score is affected by the dimensionality of the low-dimensional representation upon which the Mal-Setup line projection is based. A table is given which lists each metric name and supplies the metric value. FIGs 6, 8, 9, and 10 show what some of the metrics may measure in the arch (in these cases, maloccluded arches). These metrics are described elsewhere in this disclosure. A number of example metrics are represented visually in FIGs 6, 8, 9, and 10. FIG.6 shows a tooth-arch alignment metric - a measure of the tooth-arch alignment (direction in red) of a given tooth to the archform (light blue). FIG.8 shows a buccolingual inclination metric - a measure of the buccal-lingual inclination of a tooth. The degree of tilt of the tooth should be within acceptable tolerances. FIG.9 shows a midline metric - a measure of the alignment of the boundary between the central incisors in the upper and lower arches with a desired vertical line on the face. FIG.10 shows an overjet metric - a measure of the horizontal offset (in the posterior-anterior direction) between the central incisors in the upper and lower arches. [00150] Standard machine learning data processing may be performed on the raw metrics data prior to their ingestion by the UMAP or T-SNE algorithms. This includes, but is not limited to, data normalization. The plots may be used for one or more purposes, some of which are described below. Patient treatment may be aided by dimensionality reduction techniques as outlined above by the use of automated systems to score each treatment plan in a plurality of treatment plans, thus serving in an automated quality control capacity. 
The present technique may provide a framework by which the characteristics of the arch can be represented quantitatively rather than qualitatively, allowing for automated systems such as AI and Machine Learning based systems to learn facets of what constitutes good arches and setups. This numerical quantification of quality may enable the use of generative applications, such as teaching models how to generate high quality arches. In such approaches, the unified metrics score may be used to provide feedback to an AI model (e.g., via backpropagation) as the model learns, thereby enabling the model to identify when it is improving. In turn, this may provide scenarios in which the model can be encouraged (e.g., via reinforcement learning techniques) to make better predictions in the future. The present technique may be used to enhance loss calculation in the training of an oral care-related predictive model. The metrics visualization techniques of the present disclosure may, in some implementations, be used to visualize one or more restoration design metrics (RDM). [00151] Representation generation neural networks based on autoencoders, U-Nets, transformers, other types of encoder-decoder structures, convolution and/or pooling layers, or other models may benefit from the use of oral care arguments (e.g., oral care metrics or oral care parameters). For example, oral care metrics (e.g., orthodontic metrics or restoration design metrics) may convey aspects of the shape and/or structure of the patient’s dentition (e.g., the shape and/or structure of an individual tooth, or the spatial relationships between two or more teeth) to the neural network models of this disclosure. Each oral care metric describes distinct information about the patient’s dentition that may not be redundantly present in other input data that are provided to the neural network. For example, an “Overbite” metric may quantify the overlap between the upper and lower central incisors along the vertical Z-axis, information which may not otherwise, in some implementations, be readily ascertainable by a traditional neural network. Stated another way, the oral care metrics provide refined information about the patient’s dentition that a traditional neural network (e.g., a representation generation neural network) may not be adequately trained or configured to extract. However, a neural network which is specifically trained to generate oral care metrics may overcome such a shortcoming, because, for example, loss may be computed in such a way as to facilitate accurate oral care metrics prediction. Mesh oral care metrics may provide a processed version of the structure and/or shape of the patient’s dentition, data which may not otherwise be available to the neural network. This processed information is often more accessible to, or more amenable to encoding by, the neural network. A system implementing the techniques disclosed herein has been utilized to run a number of experiments on 3D representations of teeth. For example, oral care metrics have been provided to a representation generation neural network which is based on a U-Net model. Based on experiments, it was found that systems using oral care metrics (e.g., “Overbite”, “Overjet” and “Canine Class Relationship” metrics) were at least 2.5% more accurate than systems that did not use them. Furthermore, training converges more quickly when the oral care metrics are used.
Stated another way, the machine learning models trained using oral care metrics tended to be more accurate more quickly (at earlier epochs) than systems which did not. For an existing system observed to have a historical accuracy rate of 91%, an improvement in accuracy of 2.5% reduces the actual error rate by almost 30%. [00152] Examples of oral care metrics include Orthodontic Metrics (OM) and Restoration Design Metrics (RDM). RDM may describe the shape and/or form of one or more 3D representations of teeth for use in dental restoration. One use case example is in the creation of one or more dental restoration appliances. Another use case example is in the creation of one or more veneers (such as a zirconia veneer). Some RDM may quantify the shape and/or other characteristics of a tooth. Other RDM may quantify relationships (e.g., spatial relationships) between two or more teeth. RDM differ from restoration design parameters (RDP) in that restoration design metrics define a current state of a patient's dentition, whereas restoration design parameters serve as specifications to a machine learning or other optimization model to generate desired tooth shapes and/or forms. RDM describe the shapes of the teeth currently (e.g., in a starting or mal condition). Restoration design parameters specify how an oral care provider (such as a dentist or dental technician) intends for the teeth to look after the completion of restoration treatment. Either or both of RDM and RDP may be provided to a neural network or other machine learning or optimization algorithm for the purpose of dental restoration. In some implementations, RDM may be computed on the pre-restoration dentition of the patient (i.e., the primary implementation). In other implementations, RDM may be computed on the post-restoration dentition of the patient. A restoration design may comprise one or more teeth and may be referred to as a restoration arch. Restoration design generation may involve the generation of an improved geometry and/or structure of one or more teeth in a restoration arch. [00153] Aspects of RDM calculation are described below. In some implementations, RDM may be measured, for example, through locating landmarks in the teeth (or gums, hardware and/or other elements of the patient's dentition), and the measurements of distances between those landmarks, or otherwise made in relation to those landmarks. In some implementations, one or more neural networks or other machine learning models may be trained to identify or extract one or more RDM from one or more 3D representations of teeth (or gums, hardware and/or other elements of the patient's dentition). Techniques of this disclosure may use RDM in various ways. For instance, in some implementations, one or more neural networks or other machine learning models may be trained to classify or label one or more setups, arches, dentitions or other sets of teeth based at least in part on RDM. As such, in these examples, RDMs form a part of the training data used for training these models. [00154] Aspects of a tooth mesh reconstruction autoencoder that may be used in accordance with techniques of this disclosure are described below. An autoencoder for restoration design generation is disclosed in US Provisional Application No. US63/366514. This autoencoder (e.g., a variational autoencoder or VAE) takes as input a tooth mesh (or other 3D representation) that reflects a mal state (i.e., the pre-restoration tooth shape).
The encoder component of the autoencoder encodes that tooth mesh to a latent form (e.g., a latent vector). Modifications may be applied to this latent vector (e.g., based on a mapping of the latent space through prior experiments), for the purpose of altering the geometry and/or structure of the eventual reconstructed mesh. Additional vectors may, in some implementations, be included with the latent vector (e.g., through concatenation), and the resulting concatenation of vectors may be reconstructed by way of the decoder component of the autoencoder into a reconstructed tooth mesh which is a facsimile of the input tooth mesh. [00155] RDM and RDP may also be used as neural network inputs in the execution phase, in accordance with aspects of this disclosure. In some implementations, one or more RDM may be concatenated with the input to the encoder, for the purpose of telling the encoder specific information about the input 3D tooth representation. In some implementations, one or more RDM may be concatenated with the latent vector, before reconstruction, for the purpose of providing the decoder component with specific information about the input 3D tooth representation. Furthermore, in some implementations, one or more restoration design parameters (RDP) may be concatenated with the input to the encoder component, for the purpose of providing the encoder specific information about the input 3D tooth representation. Likewise, in some implementations, one or more restoration design parameters (RDP) may be concatenated with the latent vector, before reconstruction, for the purpose of providing the decoder specific information about the input 3D tooth representation. [00156] In this way, either or both of RDM and RDP may be introduced to the functioning of an autoencoder (e.g., a tooth reconstruction autoencoder), and serve to influence the geometry and/or structure of the reconstructed restoration design (i.e., influence the shape of the tooth on the output of the autoencoder). In some implementations, the variational autoencoder of US Provisional Application No. US63/366514 may be replaced by a capsule autoencoder (e.g., instead of encoding the tooth mesh into a latent vector, the tooth mesh is encoded into one or more latent capsules). [00157] In some implementations, clustering or other unsupervised techniques may be performed on RDM to cluster one or more setups, arches, dentitions or other sets of teeth based on the restoration characteristics of the teeth. Such clusters may be useful in treatment planning, as the clusters provide insight into categories of patients with different treatment needs. This information may be instructive to clinicians as they learn about possible treatment options. In some instances, best practices may be identified (such as default RDP values) for patient cases that fall into one or another cluster (e.g., as determined by a similarity measure, as in k-NN). After a new case is classified into a particular cluster, information about the relevant best practices may be provided to the clinician who is responsible for processing the case. Such default values may, in some instances, undergo further tuning or modifications. [00158] Case Assignment: Such clusters may be used to gain further insight into the kinds of patient cases which exist in a dataset. Analysis of such clusters may reveal that patient treatment cases with certain RDM values (or ranges of values) may take less time to treat (or alternatively more time to treat). 
Cases which take more time to treat (or are otherwise more difficult) may be assigned to experienced or senior technicians for processing. Cases which take less time to treat may be assigned to newer or less-experienced technicians for processing. Such an assignment may be further aided by finding correlations between RDM values for certain cases and the known processing durations associated with those cases. [00159] The following RDM may be measured and used in the creation of either or both of dental restoration appliances and veneers (veneers are a type of dental restoration appliance), with the objective of making the resulting teeth natural looking. Symmetry is generally a preferred characteristic. There may be differences between patients based on demographic differences. The generation of dental restoration appliances may benefit from some or all of the following RDM. Shade and translucency may pertain, in particular, to the creation of veneers, though some implementations of dental restoration appliances may also consider this information. [00160] Techniques of this disclosure may, in some implementations, obtain one or more of the following examples of data for use in an oral care metrics calculation (e.g., orthodontic metrics or RDM): 1) a digital 3D model of one or more teeth in a pre-restoration state; 2) a digital 3D model of one or more teeth in a post-restoration state; 3) a digital 3D model of one or more neighboring teeth in a pre-restoration state; 4) a digital 3D model of one or more neighboring teeth in a post-restoration state; 5) position information associated with one or more neighboring teeth from the 3D digital model; or 6) landmark information associated with one or more neighboring teeth from the 3D digital model. [00161] Non-limiting examples of inter-tooth RDM are enumerated below. [00162] 1) Bilateral Symmetry and/or Ratios: A measure of the symmetry between one or more teeth and one or more other teeth on opposite sides of the dental arch. For example, for a pair of corresponding teeth, a measure of the width of each tooth. In one instance, one tooth is of normal width, and the other tooth is too narrow. In another instance, both teeth are of normal width. The following is a list of attributes that can be measured for a tooth, and compared to the corresponding measurement for one or more corresponding teeth: a) width - mesial to distal distance; b) length - gingival to incisal distance; c) diagonal - distance across the tooth, e.g., from the mesial gingival corner to the distal incisal corner (this measure is one of many that can be used to quantify the shape of teeth beyond length and width). Ratios between a and b may be computed, such as a/b or b/a. Such ratios can be indicative of whether spatial symmetry exists (e.g., by measuring the ratio a/b on the left side, measuring the ratio a/b on the right side, and then comparing the left and right ratios, as shown in the sketch following this paragraph). In some implementations, where spatial symmetry is "off", the length, width and/or ratios may not match. Such a ratio may, in some implementations, be computed relative to a standard. A number of esthetic standards are available in the dental literature. Examples include Golden Proportion and Recurring Esthetic Dental Proportion. In some implementations, spatial symmetry may be measured on a pair of teeth, where one tooth is on the right side of the arch, and the other tooth is on the left side of the arch.
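The following is a minimal, non-authoritative Python sketch of the left/right ratio comparison just described; the function name and the millimeter values are hypothetical:

    # Compare the width/length ratio (a/b) of a left-side tooth with the
    # a/b ratio of its right-side counterpart; values near 1.0 suggest
    # bilateral symmetry of the proportions.
    def symmetry_ratio(width_left, length_left, width_right, length_right):
        ratio_left = width_left / length_left        # a/b, left tooth
        ratio_right = width_right / length_right     # a/b, right tooth
        return ratio_left / ratio_right

    # Hypothetical lateral incisors: left 6.5 mm x 8.0 mm, right 5.0 mm x 8.1 mm.
    print(symmetry_ratio(6.5, 8.0, 5.0, 8.1))  # ~1.32, i.e., right tooth too narrow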
[00163] 2) Proportions of Adjacent Teeth: A measure of the width proportions of adjacent teeth, as measured as a projection along an arch onto a plane (e.g., a plane that is situated in front of the patient's face). The ideal proportions for use in the final restoration design can be, for example, the so-called golden proportions. The golden proportions relate adjacent teeth, such as central incisors and lateral incisors. This metric pertains to the measuring of these proportions as the proportions exist in the pre-restoration mal dentition. The ideal golden proportions are 1.6, 1, 0.6, for the central incisor, lateral incisor and cuspid, on a particular side (either left or right) for a particular arch (e.g., the upper arch). If one or more of these proportion values is off (e.g., in the case of "peg laterals"), the patient may wish for dental restoration treatment to correct the proportions. [00164] 3) Arch Discrepancies: A measure of any size discrepancies between the upper arch and lower arch, for example, pertaining to the widths of the teeth, for the purpose of dental restoration. For example, techniques of this disclosure may make adjacent tooth width proportion measurements in the upper arch and in the lower arch. In some implementations, Bolton analysis measurements may be made by measuring upper widths, lower widths, and proportions between those quantities. Arch discrepancies may be described in absolute measurements (e.g., in mm or other suitable units) or in terms of proportions or ratios, in various implementations. [00165] 4) Midline: A measure of the midline of the maxillary incisors, relative to the midline of the mandibular incisors. Techniques of this disclosure may measure the midline of the maxillary incisors, relative to the midline of the nose (if data about nose location is available). [00166] 5) Proximal Contacts: A measure of the size (area, volume, circumference, etc.) of the proximal contact between adjacent teeth. In the ideal circumstance, the teeth touch along the mesial/distal surfaces and the gums fill in gingivally to where the teeth touch. Black triangles may form if the gum tissue fails to fill the space below the proximal contact. In some instances, the proximal contact may get progressively shorter for teeth located farther towards the posterior of the arch. In an ideal scenario, the proximal contact would be long enough so that there is an appropriately sized incisal embrasure and the gum tissue fills in the area below or gingival to the contact. [00167] 6) Embrasure: In some implementations, techniques of this disclosure may measure the size (area, volume, circumference, etc.) of an embrasure, the gap between teeth at either the gingival or incisal edge. In some implementations, techniques of this disclosure may measure the symmetry between embrasures on opposite sides of the arch. An embrasure is based at least in part on the length of the contact between teeth, and/or at least in part on the shape of the tooth. In some instances, the embrasure may get progressively longer for teeth located farther towards the posterior of the arch. [00168] Non-limiting examples of intra-tooth RDM are enumerated below, continuing with the numbering of other RDM listed above. [00169] 7) Length and/or Width: A measure of the length of a tooth relative to the width of that tooth. This metric may reveal, for example, that a patient has long central incisors.
Width and length are defined as: a) width - mesial to distal distance; b) length - gingival to incisal distance; c) other dimensions of the tooth body - the portions of the tooth between the gingival region and the incisal edge. In some implementations, either or both of a length and a width may be measured for a tooth and compared to the length and/or width of one or more other teeth. [00170] 8) Tooth Morphology: A measure of the primary anatomy of the tooth shape, such as line angles, buccal contours, and/or incisal angles and/or embrasures. The frequency and/or dimensions may be measured. In some implementations, the observed primary tooth shape aspects may be matched to one or more known styles. Techniques of this disclosure may measure secondary anatomy of the tooth shape, such as mamelon grooves. For instance, the frequency and/or dimensions may be measured. In some implementations, the observed secondary tooth shape aspects may be matched to one or more known styles. In some examples, techniques of this disclosure may measure tertiary anatomy of the tooth shape, such as perikymata or striations. For instance, the frequency and/or dimensions may be measured. In some implementations, the observed tertiary tooth shape aspects may be matched to one or more known styles. [00171] 9) Shade and/or Translucency: A measure of tooth shade and/or translucency. Tooth shade is often described by the Vita Classical or 3D Master shade guide. Tooth translucency is described by transmittance or a contrast ratio. Tooth shade and translucency may be evaluated (or measured) based on one or more of the following kinds of data pertaining to teeth: the incisal edge, incisal third, body and gingival third. The enamel layer translucency is generally higher than that of the dentin or cementum layers. Shade and translucency may, in some implementations, be measured on a per-voxel (local) basis. Shade and translucency may, in some implementations, be measured on a per-area basis, such as an incisal area, tooth body area, etc. Tooth body may pertain to the portions of the tooth between the gingival region and the incisal edge. [00172] 10) Height of Contour: A measure of the contour of a tooth. When viewed from the proximal view, all teeth have a specific contour or shape, moving from the gingival aspect to the incisal. This is referred to as the facial contour of the tooth. In each tooth, there is a height of contour, where that shape is the most pronounced. This height of contour changes from the teeth in the anterior of the arch to the teeth in the posterior of the arch. In some implementations, this measurement may take the form of fitting against a template of known dimensions and/or known proportions. In some implementations, this measurement may quantify a degree of curvature along the facial tooth surface. In some implementations, the techniques may measure the location along the contour of the tooth where the height of the curvature is most pronounced. This location may be measured as a distance away from the gingival margin, a distance away from the incisal edge, or a percentage along the length of the tooth. [00173] RDMs may be converted to restoration design scores (RDS) that represent the RDMs’ agreement with or deviation from ideal values in a patient case dataset.
These RDS can then be used to rate restoration designs and inform oral care providers and/or automated systems about which RDMs are currently in agreement with good restoration designs, suggesting that the restoration designs do not need to be improved further, and which RDMs are not in agreement with good restoration designs, suggesting that the restoration designs need to be further refined. [00174] First, the network may learn baseline values for each of the RDMs from a ground truth dataset of post-restoration designs. The ideal post-restoration arches may be further refined or possibly tailored to specific practices or oral care providers (e.g., a separate set of ground truth restoration designs for each dentist). All features are computed for each restoration design in the set of ground truth restoration designs. For each RDM, the median, kth percentile and (100−k)th percentile are computed. For concreteness, the k=25 case is described. Once baseline values have been computed, metrics for new restoration designs can be converted to RDS using the approach shown in the pseudo-code below:

    if rdm_value > ideal_median:
        rds = (rdm_value - ideal_median) / (ideal_75th_percentile - ideal_median)
    else:
        rds = (rdm_value - ideal_median) / (ideal_median - ideal_25th_percentile)

[00175] An RDS value between −1 and 1 indicates that the RDM value is within the 25th−75th percentile values of the RDM in the ground truth dataset, while an RDS greater than 1 or less than −1 indicates that the RDM is outside of the normal range in the ground truth dataset. [00176] The RDMs and/or RDSs described above may be used to train a machine learning (ML) classifier to rate restoration designs. First, the classifier learns baseline values for each RDM and/or RDS from a dataset of case data, which includes pre-restoration and post-restoration dentition for each patient. This is followed by an optional normalization step which makes the features have zero mean and unit variance. ML classifiers are subsequently trained using cross-validation to identify post-restoration designs. Classifiers may include, for example, a Support Vector Machine (SVM), an elliptical covariance estimator, a Principal Components Analysis (PCA) reconstruction error-based classifier, decision trees, random forests, an AdaBoost classifier, Naïve Bayes, or neural networks (such as those disclosed elsewhere in this disclosure). Other classifiers are also possible, such as classifiers disclosed elsewhere in this disclosure. [00177] RDSs and/or ML classifiers can be used for several tasks during automated restoration design generation in accordance with the techniques of this disclosure, and are presented in enumerated format below: [00178] 1. Selection of One or More Initialization “Target” Restoration Designs from a Set of Candidate Restoration Designs: Initialization “target” restoration designs represent restoration designs that fix some issues with the initial malocclusion but still need to be optimized to result in an adequate restoration design. These restoration designs may allow for speed-up of the restoration design search process by allowing for difficult tooth restoration geometries to be generated upfront. Target restoration designs may be generated by performing simple operations (e.g., filling-in template shapes over “stub” teeth). Alternatively, the classifier may learn restoration designs based on data from existing restoration designs.
Once a set of target restoration designs has been created, the classifier may select the best restoration design(s) by choosing the restoration design(s) that minimize one or more RDSs, or by choosing the restoration design(s) that a ML model rates as being most representative of a post-restoration design. [00179] 2. Loss Function for Restoration Design Optimization: Restoration designs may be automatically created by iteratively adjusting the geometries and/or structures of one or more teeth in a restoration arch. The loss function can be defined as a single RDM or RDS, a linear or non-linear combination of RDM/RDS, or the continuous output of a ML classifier. Output from a ML classifier of this disclosure may include, for example, the distance to a hyperplane in a SVM, the distance metric in a GMM, or a probability that the state is a member of the pre-restoration or post-restoration class in a two-class classifier. [00180] 3. RDM Selection for Restoration Design Optimization: The RDSs described above indicate the deviation of RDMs from an ideal restoration design. An RDS between −1 and 1 indicates that the RDM lies within ideal data, while values outside this range suggest that the RDM lies outside of ideal values and should be further improved. Thus, RDSs can be used to select RDMs in a restoration design that need to be further optimized. Conversely, RDSs can also be used to identify RDMs that are currently in the acceptable range and whose deviation should not be increased during optimization. [00181] 4. Stopping Criteria for Optimization: RDMs, RDSs, or the output of a ML classifier can be used to evaluate the acceptability of a restoration design. If the restoration design lies within an acceptable range, iterative optimization can be terminated. [00182] 5. Selection of Restoration Design from a Set of Candidate Restoration Designs: Restoration design generation can be designed to produce multiple candidate restoration designs, from which a subset can be selected. The subset may include the single best scoring restoration design or multiple restoration designs that achieve the best RDSs or RDSs above a threshold. Alternatively, RDSs can be used to identify a subset of restoration designs that represent certain tradeoffs in restoration design. [00183] RDMs, RDSs, and/or ML classifiers can be used in interactive tools to assist clinicians during restoration design generation and evaluation. [00184] 1. Interactive Tool for Restoration Design Generation: RDM could be computed and displayed to an oral care provider during interactive restoration design generation. The oral care provider could use clinical expertise to determine which RDM may benefit from further improvement and could perform the appropriate tooth geometry and/or structure modifications to achieve these improvements. Such a tool could also be used to train new oral care providers in how to develop an acceptable restoration design. Alternatively, scores for individual RDMs and/or ML classifier output could be displayed. This would provide information about the severity of issues with restoration design generation. The ML classifier would alert the oral care provider if the restoration design did not closely resemble an ideal post-restoration design. RDSs would indicate which RDMs needed to be further refined to achieve a restoration design which is suitable for use in creating a dental restoration appliance. [00185] 2.
Interface for Restoration Design Evaluation: RDMs, RDSs, and/or ML classifier output could be provided to a clinician and patient in a display interface alongside the restoration design. By providing this type of easily interpretable information, the systems of this disclosure may help the patient understand the target restoration design and promote treatment acceptance. [00186] 3. Interface for Dental Restoration Design Selection from Candidate Dental Restoration Designs: RDMs, RDSs, and/or ML classifier output could be used to demonstrate trade-offs between multiple candidate restoration designs. This information could be displayed alongside multiple options for restoration designs along with additional factors (e.g., treatment cost) to enable the patient to make an informed decision about treatment selection. [00187] In some implementations, systems of this disclosure may modify the shape and/or structure of a 3D representation of a tooth (e.g., a 3D mesh of a tooth), based at least in part on the dental restoration metrics and scores described herein. One or more mesh elements of a tooth representation may be modified, including a 3D point (in the case of a 3D point cloud), a vertex, a face, an edge or a voxel (in the case of sparse processing). A mesh element may be removed. A mesh element may be added. A mesh element may undergo modification to that mesh element’s position, to the mesh element’s orientation or to both. In some instances, one or more mesh elements may undergo smoothing in the course of forming the target restoration design. One or more mesh elements may undergo deformations which are computed, at least in part, in consideration of one or more RDM. [00188] In some instances, one or more RDM may be provided to a machine learning model (e.g., an encoder, a neural network - such as an autoencoder, a U-Net, a transformer or a network comprising convolution and/or pooling layers) which has been trained to generate a 3D oral care representation, such as a restoration tooth design (e.g., to design a crown, a root or both). Such RDM may impart information about the shape and/or structure of the one or more tooth meshes (or point clouds, etc.) to a neural network which has been trained to generate representations of the teeth, thereby improving those representations. In the instance that an autoencoder (e.g., a variational autoencoder or a capsule autoencoder) is trained to generate a latent representation of a 3D oral care representation (e.g., a dimensionality-reduced latent vector of a tooth mesh), the autoencoder may also be trained to reconstruct the 3D oral care representation out of that latent vector. The reconstructed 3D oral care representation may be compared to the inputted 3D oral care representation using a reconstruction error calculation. In the case of a tooth reconstruction autoencoder, the reconstructed tooth may be compared to the inputted tooth, to show how closely the shape and/or structure of the reconstructed tooth corresponds to the inputted tooth. [00189] FIG.11 shows an example technique, using systems of this disclosure, to generate a tooth restoration design using RDM. A pre-restoration tooth design may be received at step 1102 (e.g., a 3D representation). The pre-restoration tooth design may be provided to a module at step 1104 which may compute one or more RDM (e.g., “Length and/or Width”, “Height of Contour”, or “Tooth Morphology”, among others) on that pre-restoration tooth.
At step 1106, a scoring function may be executed on the RDM, according to the descriptions herein. A termination criterion is evaluated at step 1108. Examples of criteria include a maximum number of iterations. Other examples of termination criteria include evaluation of the RDM and/or score, to determine whether the RDM and/or score are within thresholds or tolerances (e.g., whether the tooth has become sufficiently wide or sufficiently long). If the termination criterion is not yet met at step 1108, then at step 1114 the shape and/or structure of the tooth is modified. After modifying the tooth in step 1114, one or more RDM are updated in step 1112. The technique then iterates as illustrated and described. After the termination criterion is met, the completed restoration design is outputted at step 1110. [00190] Techniques of this disclosure may train an encoder-decoder structure to reconstruct a 3D oral care representation which is suitable for oral care appliance generation. An encoder-decoder structure may comprise at least one encoder or at least one decoder. Non-limiting examples of an encoder-decoder structure include a 3D U-Net, a transformer, a pyramid encoder-decoder or an autoencoder, among others. Non-limiting examples of autoencoders include a variational autoencoder, a regularized autoencoder, a masked autoencoder or a capsule autoencoder. [00191] Techniques described herein may be trained to generate 3D oral care representations (e.g., tooth restoration designs, appliance components, and other examples of 3D oral care representations described herein). Such 3D oral care representations may comprise point clouds, polylines, meshes, voxels and the like. Such 3D oral care representations may be generated according to the requirements of the oral care arguments which may, in some implementations, be supplied to the generative model. Oral care arguments may include oral care parameters as disclosed herein, or other real-valued, text-based or categorical inputs which specify intended aspects of the one or more 3D oral care representations which are to be generated. In some instances, oral care arguments may include oral care metrics, which may describe intended aspects of the one or more 3D oral care representations which are to be generated. Oral care arguments are specifically adapted to the implementations described herein. For example, the oral care arguments may specify the intended designs (e.g., including shape and/or structure) of 3D oral care representations which may be generated (or modified) according to techniques described herein. In short, implementations using the specific oral care arguments disclosed herein generate more accurate 3D oral care representations than implementations that do not use the specific oral care arguments. In some instances, a text encoder may encode a set of natural language instructions from the clinician (e.g., generate a text embedding). A text string may comprise tokens. An encoder for generating text embeddings may, in some implementations, apply either mean-pooling or max-pooling between the token vectors. In some instances, a transformer (e.g., BERT or Siamese BERT) may be trained to extract embeddings of text for use in digital oral care (e.g., by training the transformer on examples of clinical text, such as those given below).
In some instances, such a model for generating text embeddings may be trained using transfer learning (e.g., initially trained on another corpus of text, and then receiving further training on text related to digital oral care). Some text embeddings may encode text at the word level. Some text embeddings may encode text at the token level. A transformer for generating a text embedding may, in some implementations, be trained, at least in part, with a loss calculation which compares predicted outputs to ground truth outputs (e.g., softmax loss, multiple negatives ranking loss, MSE margin loss, cross-entropy loss or the like). In some instances, the non-text arguments, such as real values or categorical values, may be converted to text, and subsequently embedded using the techniques described herein. The following are examples of natural language instructions that may be issued by a clinician to the generative models described herein: “Generate a restoration design (alternatively a veneer design) which closes the diastema between tooth #8-9 by evenly adding width onto the mesial of both teeth”, “Generate a restoration design (alternatively a veneer design) which ensures that the incisal edges of #6-11 form an even semicircle with the incisal edges of the posterior teeth (when looking from the incisal view)”, or “Generate a customized crown for an upper left central incisor, to be implanted. The crown shape should take into consideration the shape of adjacent teeth and should have no more than x mm (e.g., 0.1 mm) space between the adjacent teeth.” [00192] Techniques of this disclosure may, in some implementations, use PointNet, PointNet++, or derivative neural networks (e.g., networks trained via transfer learning using either PointNet or PointNet++ as a basis for training) to extract local or global neural network features from a 3D point cloud or other 3D representation (e.g., a 3D point cloud describing aspects of the patient’s dentition – such as teeth or gums). Techniques of this disclosure may, in some implementations, use U-Nets to extract local or global neural network features from a 3D point cloud or other 3D representation. [00193] 3D oral care representations are described herein as such because 3-dimensional representations are currently state of the art. Nevertheless, 3D oral care representations are intended to be used in a non-limiting fashion to encompass any representations of 3 dimensions or higher orders of dimensionality (e.g., 4D, 5D, etc.), and it should be appreciated that machine learning models can be trained using the techniques disclosed herein to operate on representations of higher orders of dimensionality. [00194] In some instances, input data may comprise 3D mesh data, 3D point cloud data, 3D surface data, 3D polyline data, 3D voxel data, or data pertaining to a spline (e.g., control points). An encoder-decoder structure may comprise one or more encoders, or one or more decoders. In some implementations, the encoder may take as input mesh element feature vectors for one or more of the inputted mesh elements, to improve the ability of the encoder to generate a representation of the input data. Examples of encoder-decoder structures include U-Nets, autoencoders or transformers (among others). A representation generation module may comprise one or more encoder-decoder structures (or portions of encoder-decoder structures – such as individual encoders or individual decoders).
A representation generation module may generate an information-rich (optionally reduced-dimensionality) representation of the input data, which may be more easily consumed by other generative or discriminative machine learning models. [00195] A U-Net may comprise an encoder, followed by a decoder. The architecture of a U-Net may resemble a U shape. The encoder may extract one or more global neural network features from the input 3D representation, zero or more intermediate-level neural network features, or one or more local neural network features (at the most local level as contrasted with the most global level). The output from each level of the encoder may be passed along to the input of corresponding levels of a decoder (e.g., by way of skip connections). Like the encoder, the decoder may operate on multiple levels of global-to-local neural network features. For instance, the decoder may output a representation of the input data which may contain global, intermediate or local information about the input data. The U-Net may, in some implementations, generate an information-rich (optionally reduced-dimensionality) representation of the input data, which may be more easily consumed by other generative or discriminative machine learning models. [00196] An autoencoder may be configured to encode the input data into a latent form. An autoencoder may train an encoder to reformat the input data into a reduced-dimensionality latent form in between the encoder and the decoder, and then train a decoder to reconstruct the input data from that latent form of the data. A reconstruction error may be computed to quantify the extent to which the reconstructed form of the data differs from the input data. The latent form may, in some implementations, be used as an information-rich reduced-dimensionality representation of the input data which may be more easily consumed by other generative or discriminative machine learning models. In most scenarios, an autoencoder may be trained to input a 3D representation, encode that 3D representation into a latent form (e.g., a latent embedding), and then reconstruct a close facsimile of that input 3D representation at the output. [00197] A transformer may be trained to use self-attention to generate, at least in part, representations of its input. A transformer may encode long-range dependencies (e.g., encode relationships between a large number of inputs). A transformer may comprise an encoder or a decoder. Such an encoder may, in some implementations, operate in a bi-directional fashion or may operate a self-attention mechanism. Such a decoder may, in some implementations, operate a masked self-attention mechanism, operate a cross-attention mechanism, or operate in an auto-regressive manner. The self-attention operations of the transformers described herein may, in some implementations, relate different positions or aspects of an individual 3D oral care representation in order to compute a reduced-dimensionality representation of that 3D oral care representation. The cross-attention operations of the transformers described herein may, in some implementations, mix or combine aspects of two (or more) different 3D oral care representations. The auto-regressive operations of the transformers described herein may, in some implementations, consume previously generated aspects of 3D oral care representations (e.g., previously generated points, point clouds, transforms, etc.) as additional input when generating a new or modified 3D oral care representation.
The transformer may, in some implementations, generate a latent form of the input data, which may be used as an information-rich reduced-dimensionality representation of the input data, which may be more easily consumed by other generative or discriminative machine learning models. [00198] In some implementations, an encoder-decoder structure may first be trained as an autoencoder. In deployment, one or more modifications may be made to the latent form of the input data. This modified latent form may then proceed to be reconstructed by the decoder, yielding a reconstructed form of the input data which differs from the input data in one or more intended aspects. Oral care arguments, such as oral care parameters or oral care metrics, may be supplied to the encoder, the decoder, or may be used in the modification of the latent form, to influence the encoder-decoder structure in generating a reconstructed form that has desired characteristics (e.g., characteristics which may differ from that of the input data). [00199] Techniques of this disclosure may, in some instances, be trained using federated learning. Federated learning may enable multiple remote clinicians to iteratively improve a machine learning model (e.g., validation of 3D oral care representations, mesh segmentation, mesh cleanup, other techniques which involve labeling mesh elements, coordinate system prediction, non-organic object placement on teeth, appliance component generation, tooth restoration design generation, techniques for placing 3D oral care representations, setups prediction, generation or modification of 3D oral care representations using autoencoders, generation or modification of 3D oral care representations using transformers, generation or modification of 3D oral care representations using diffusion models, 3D oral care representation classification, imputation of missing values), while protecting data privacy (e.g., the clinical data may not need to be sent “over the wire” to a third party). Data privacy is particularly important for clinical data, which is protected by applicable laws. A clinician may receive a copy of a machine learning model, use a local machine learning program to further train that ML model using locally available data from the local clinic, and then send the updated ML model back to the central hub or third party. The central hub or third party may integrate the updated ML models from multiple clinicians into a single updated ML model which benefits from the learnings of recently collected patient data at the various clinical sites. In this way, a new ML model may be trained which benefits from additional and updated patient data (possibly from multiple clinical sites), while those patient data are never actually sent to the third party. Training on a local in-clinic device may, in some instances, be performed when the device is idle or otherwise be performed during off-hours (e.g., when patients are not being treated in the clinic). Devices in the clinical environment for the collection of data and/or the training of ML models for techniques described here may include intra-oral scanners, CT scanners, X-ray machines, laptop computers, servers, desktop computers or handheld devices (such as smart phones with image collection capability). In addition to federated learning techniques, in some implementations, contrastive learning may be used to train, at least in part, the ML models described herein.
Contrastive learning may, in some instances, augment samples in a training dataset to accentuate the differences in samples from different classes and/or increase the similarity of samples of the same class. [00200] Machine learning models such as U-Nets, encoders, autoencoders, pyramid encoder-decoders, transformers, or convolution and/or pooling layers may be trained as a part of a method for hardware (or appliance component) placement. Representation learning may train a first module to determine an embedded representation of a 3D oral care representation (e.g., encoding a mesh or point cloud into a latent form using an autoencoder, or using a U-Net, encoder, transformer, block of convolution and/or pooling layers or the like). That representation may comprise a reduced-dimensionality form and/or information-rich version of the inputted 3D oral care representation. In some implementations, the generation of a representation may be aided by the calculation of a mesh element feature vector for one or more mesh elements (e.g., each mesh element). In some implementations, a representation may be computed for a hardware element (or appliance component). Such representations are suitable to be provided to a second module, which may perform a generative task, such as oral care metric generation. In the instance where a U-Net (among other neural networks) is trained to generate the representations of tooth meshes, the mesh convolution and/or mesh pooling techniques described herein enjoy invariance to rotations/translations/scaling of that tooth mesh. Examples: Example 1. A method for generating one or more target shapes for a tooth, the method comprising: receiving, by processing circuitry of a computing device, a digital 3D model of the tooth in a pre-restoration state; executing, by the processing circuitry, a scoring function using one or more dental restoration metrics related to the tooth in the pre-restoration state as input to generate a score associated with the tooth in the pre-restoration state; modifying, by the processing circuitry, at least one of a structure or a shape of the tooth to form modified aspects of the tooth; and updating, by the processing circuitry, the at least one of the structure or the shape of the tooth based on the score and the modified aspects of the tooth to generate one or more post-restoration states of the tooth after implementation of a dental restoration treatment on the tooth. Example 2. The method of Example 1, further comprising obtaining, by the processing circuitry, position information associated with the tooth from the 3D digital model. Example 3. The method of Example 1, further comprising obtaining, by the processing circuitry, landmark information associated with the tooth from the 3D digital model. Example 4. The method of Example 1, wherein the scoring function is represented by the formula: score(X) = Σi wi Pi(xi),
where X represents a vector of metrics computed for a state, Pi is a penalty function that computes an error or penalty given a value of a metric and acceptable levels for that metric, and wi is a weight associated with the penalty for the metric. Example 5. The method of Example 1, wherein the scoring function is represented by f(P(x)), where f is a function selected from one or more of linear functions, non-linear functions, and probabilistic framework functions. Example 6. The method of Example 1, wherein modifying the at least one of the structure or the shape of the tooth comprises modifying at least one of the position or orientation of a mesh element, by the processing circuitry, within the digital 3D model of the tooth to generate the modified state of the tooth. Example 7. The method of Example 6, wherein a mesh element comprises at least one of a point, a vertex, an edge, a face or a voxel. Example 8. The method of Example 6, wherein a smoothing operation is applied to one or more mesh elements. Example 9. The method of Example 6, wherein one or more mesh elements are removed from the tooth. Example 10. The method of Example 6, wherein one or more mesh elements are added to the tooth. Example 11. The method of Example 1, wherein the one or more post-restoration states of the tooth are used to generate a design for a dental restoration appliance. Example 12. The method of Example 1, wherein the one or more post-restoration states of the tooth are used to generate a design for an orthodontic appliance. Example 13. The method of Example 12, wherein the orthodontic appliance is a clear tray aligner (CTA). Example 14. The method of Example 1, wherein the computing device is deployed at a clinical context, and wherein the method is performed at the clinical context. Example 15. The method of Example 1, wherein the one or more dental restoration metrics are further provided to a machine learning model which has been trained to generate a 3D oral care representation. Example 16. The method of Example 15, wherein the generated 3D oral care representation comprises one or more post-restoration states of the tooth.
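The following is a minimal, non-authoritative Python sketch of the Example 4 scoring function, score(X) = Σi wi Pi(xi); the interval-based penalty function, the weights and the acceptable ranges are hypothetical choices for illustration only:

    # Hypothetical penalty: zero inside the acceptable range, growing
    # quadratically with the distance outside it.
    def interval_penalty(value, low, high):
        if value < low:
            return (low - value) ** 2
        if value > high:
            return (value - high) ** 2
        return 0.0

    # score(X) = sum_i( w_i * P_i(x_i) ) over the metrics vector X.
    def score(metrics, weights, acceptable_ranges):
        return sum(w * interval_penalty(x, lo, hi)
                   for x, w, (lo, hi) in zip(metrics, weights, acceptable_ranges))

    # Two hypothetical metrics: a tooth width (8.0-9.0 mm acceptable) and a
    # length/width ratio (1.1-1.3 acceptable).
    print(score([7.5, 1.35], [1.0, 2.0], [(8.0, 9.0), (1.1, 1.3)]))  # 0.255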

Claims

CLAIMS WHAT IS CLAIMED IS: 1. A method of visualizing at least one oral care metric, the method comprising: receiving, by processing circuitry of a computing device, a plurality of three-dimensional (3D) representations of oral care data; computing, by the processing circuitry, a plurality of oral care metrics based on the plurality of 3D representations; converting, by the processing circuitry, the plurality of oral care metrics into a multidimensional format; reducing, by the processing circuitry, a dimensionality of each oral care metric in the plurality of oral care metrics in the multidimensional format to generate a reduced-dimensionality version of the plurality of oral care metrics; and rendering, by the processing circuitry, the reduced-dimensionality version of the plurality of oral care metrics in a visualized form.
2. The method of claim 1, further comprising transmitting, by the processing circuitry, the plurality of oral care metrics rendered in the visualized form to a system configured to construct an oral care appliance or a component of the oral care appliance using the plurality of oral care metrics rendered in the visualized form.
3. The method of claim 2, wherein the oral care appliance comprises at least one of a dental restoration appliance or an orthodontic appliance.
4. The method of claim 1, wherein at least one oral care metric quantifies a geometrical relationship between at least two teeth in a respective 3D representation of oral care data in the plurality of 3D representations of oral care data.
5. The method of claim 1, wherein the plurality of oral care metrics quantifies at least one of the structure or a shape of an individual tooth in a respective 3D representation of oral care data in the plurality of 3D representations of oral care data.
6. The method of claim 1, further comprising rendering, by the processing circuitry, a graphical user interface (GUI) element that indicates a progression from a maloccluded setup to a final setup associated with an orthodontic treatment plan for a patient.
7. The method of claim 6, further comprising validating a fitness of a setup series that includes one or more of the maloccluded setup, the final setup, and one or more intermediate stages associated with the orthodontic treatment plan for the patient.
8. The method of claim 6, wherein the orthodontic treatment plan for the patient involves a use of a clear tray aligner (CTA).
9. The method of claim 1, wherein the reducing uses one or more of Stochastic Neighbor Embedding (SNE), t-Distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection for Dimension Reduction (U-Map), Locally Linear Embedding (LLE), Symmetric SNE, Isomap, or Sammon Mapping.
10. The method of claim 1, wherein the plurality of three-dimensional (3D) representations of oral care data pertain to respective patient cases and each of the respective patient cases in the reduced-dimensionality version is specified by a respective tuple in a plurality of tuples, the method further comprising: providing the plurality of tuples to a clustering module; and performing, by the clustering module, an unsupervised clustering on the plurality of tuples.
11. The method of claim 10, further comprising generating one or more histograms based at least in part on the unsupervised clustering.
12. The method of claim 10, wherein each patient case includes representations of the patient’s teeth and respective oral care metrics in the plurality of oral care metrics quantify relationships between one or more of the patient’s teeth.
13. An apparatus for visualizing at least one oral care metric, the apparatus comprising:
means for receiving one or more three-dimensional (3D) representations of oral care data;
means for computing a plurality of oral care metrics based on the one or more 3D representations of oral care data;
means for converting the plurality of oral care metrics into a multidimensional format;
means for reducing a dimensionality of the plurality of oral care metrics in the multidimensional format to form a reduced-dimensionality version of the plurality of oral care metrics; and
means for rendering the reduced-dimensionality version of the plurality of oral care metrics in a visualized form.
14. A device for visualizing at least one oral care metric, the device comprising:
interface hardware configured to receive one or more three-dimensional (3D) representations of oral care data;
memory hardware configured to store the one or more 3D representations of oral care data; and
processing circuitry in communication with the memory hardware, the processing circuitry being configured to:
compute a plurality of oral care metrics based on the one or more 3D representations of oral care data;
convert the plurality of oral care metrics into a multidimensional format;
reduce a dimensionality of the plurality of oral care metrics in the multidimensional format to form a reduced-dimensionality version of the plurality of oral care metrics; and
render the reduced-dimensionality version of the plurality of oral care metrics in a visualized form.
15. The device of claim 14, wherein the processing circuitry is further configured to transmit, via the interface hardware, the plurality of oral care metrics rendered in the visualized form to a system configured to construct an oral care appliance or a component of the oral care appliance using the plurality of oral care metrics rendered in the visualized form.
16. The device of claim 15, wherein the oral care appliance comprises at least one of a dental restoration appliance or an orthodontic appliance.
17. The device of claim 14, wherein the plurality of oral care metrics quantifies a geometrical relationship between at least two teeth in a respective 3D representation of oral care data in the plurality of 3D representations of oral care data.
18. The device of claim 14, wherein the plurality of oral care metrics quantifies at least one of a structure or a shape of an individual tooth in a respective 3D representation of oral care data in the plurality of 3D representations of oral care data.
19. The device of claim 14, wherein the processing circuitry is further configured to render a graphical user interface (GUI) element that indicates a progression from a maloccluded setup to a final setup associated with an orthodontic treatment plan for a patient.
20. The device of claim 14, wherein the device is deployed in a clinical context.
PCT/IB2023/062707 2022-12-14 2023-12-14 Metrics calculation and visualization in digital oral care Ceased WO2024127313A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP23829131.4A EP4634931A1 (en) 2022-12-14 2023-12-14 Metrics calculation and visualization in digital oral care

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202263432627P 2022-12-14 2022-12-14
US63/432,627 2022-12-14
US202363461236P 2023-04-21 2023-04-21
US63/461,236 2023-04-21

Publications (1)

Publication Number Publication Date
WO2024127313A1 2024-06-20

Family ID=89386143

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2023/062707 Ceased WO2024127313A1 (en) 2022-12-14 2023-12-14 Metrics calculation and visualization in digital oral care

Country Status (2)

Country Link
EP (1) EP4634931A1 (en)
WO (1) WO2024127313A1 (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020026117A1 (en) 2018-07-31 2020-02-06 3M Innovative Properties Company Method for automated generation of orthodontic treatment final setups
US20210259808A1 (en) 2018-07-31 2021-08-26 3M Innovative Properties Company Method for automated generation of orthodontic treatment final setups
US20220249201A1 (en) 2018-07-31 2022-08-11 3M Innovative Properties Company Dashboard for visualizing orthodontic metrics during setup design
WO2021110938A1 (en) * 2019-12-06 2021-06-10 3Shape A/S Method for generating edge curve for dental devices
WO2021245480A1 (en) 2020-06-03 2021-12-09 3M Innovative Properties Company System to generate staged orthodontic aligner treatment
WO2022123402A1 (en) 2020-12-11 2022-06-16 3M Innovative Properties Company Automated processing of dental scans using geometric deep learning

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"Equidistant and Uniform Data Augmentation for 3D Objects", IEEE ACCESS
ASHISH VASWANINOAM SHAZEERNIKI PARMARNIKI PARMARLLION JONESAIDAN N. GOMEZLUKASZ KAISERILLIA POLOSUKHIN, ATTENTION IS ALL YOU NEED, 2017
J. M. KANTERK. VEERAMACHANENI: "Deep feature synthesis: Towards automating data science endeavors", 2015 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA, 2015, pages 1 - 10, XP032826310, DOI: 10.1109/DSAA.2015.7344858
P. CIGNONIC. ROCCHINIR. SCOPIGNO: "Computer Graphics Forum", vol. 17, June 1998, BLACKWELL PUBLISHERS, article "Metro: measuring error on simplified surfaces", pages: 167 - 174
TONIONI A ET AL.: "Learning to detect good 3D keypoints", INT J COMPUT. VIS., vol. 126, 2018, pages 1 - 20, XP036405732, DOI: 10.1007/s11263-017-1037-3

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119782721A (en) * 2025-03-10 2025-04-08 成都信息工程大学 Dynamic physiological index filling method, device, system and storage medium
CN119952934A (en) * 2025-04-08 2025-05-09 广东材通实业有限公司 A forming processing device and method for PVC cable tube

Also Published As

Publication number Publication date
EP4634931A1 (en) 2025-10-22

Similar Documents

Publication Publication Date Title
WO2024127318A1 (en) Denoising diffusion models for digital oral care
EP4634798A1 (en) Neural network techniques for appliance creation in digital oral care
WO2024127309A1 (en) Autoencoders for final setups and intermediate staging in clear tray aligners
WO2024127303A1 (en) Reinforcement learning for final setups and intermediate staging in clear tray aligners
WO2024127316A1 (en) Autoencoders for the processing of 3d representations in digital oral care
WO2024127311A1 (en) Machine learning models for dental restoration design generation
US20250364117A1 (en) Mesh Segmentation and Mesh Segmentation Validation In Digital Dentistry
US20250366959A1 (en) Geometry Generation for Dental Restoration Appliances, and the Validation of That Geometry
WO2024127313A1 (en) Metrics calculation and visualization in digital oral care
EP4634934A1 (en) Geometric deep learning for final setups and intermediate staging in clear tray aligners
WO2024127304A1 (en) Transformers for final setups and intermediate staging in clear tray aligners
US20250363269A1 (en) Fixture Model Validation for Aligners in Digital Orthodontics
US20250359964A1 (en) Coordinate System Prediction in Digital Dentistry and Digital Orthodontics, and the Validation of that Prediction
US20250366958A1 (en) Validation for Rapid Prototyping Parts in Dentistry
EP4540833A1 (en) Validation of tooth setups for aligners in digital orthodontics
EP4539771A1 (en) Bracket and attachment placement in digital orthodontics, and the validation of those placements
WO2024127314A1 (en) Imputation of parameter values or metric values in digital oral care
WO2024127308A1 (en) Classification of 3d oral care representations
EP4633527A1 (en) Pose transfer techniques for 3d oral care representations
WO2025126117A1 (en) Machine learning models for the prediction of data structures pertaining to interproximal reduction
WO2024127310A1 (en) Autoencoders for the validation of 3d oral care representations
EP4633530A1 (en) Setups comparison for final setups and intermediate staging in clear tray aligners
WO2025074322A1 (en) Combined orthodontic and dental restorative treatments
WO2025257747A1 (en) Machine learning models for layered tooth restoration design generation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 23829131; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2023829131; Country of ref document: EP)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2023829131; Country of ref document: EP; Effective date: 20250714)
WWP Wipo information: published in national office (Ref document number: 2023829131; Country of ref document: EP)