
WO2025108661A1 - Contrastive deep learning for defect inspection - Google Patents

Contrastive deep learning for defect inspection

Info

Publication number
WO2025108661A1
Authority
WO
WIPO (PCT)
Prior art keywords: training, image, input, feature vector, machine learning
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2024/080483
Other languages
English (en)
Inventor
Tiago BOTARI
Daniel Robert FAATZ
Tim Jeroen SCHOONBEEK
Bart PRONK
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ASML Netherlands BV
Original Assignee
ASML Netherlands BV
Application filed by ASML Netherlands BV filed Critical ASML Netherlands BV
Publication of WO2025108661A1


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0004 Industrial image inspection
    • G06T 7/001 Industrial image inspection using an image reference approach
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/98 Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G06V 10/993 Evaluation of the quality of the acquired pattern
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30108 Industrial image inspection
    • G06T 2207/30148 Semiconductor; IC; Wafer

Definitions

  • the embodiments provided herein generally relate to a defect detection technique, and more particularly, to a defect detection technique utilizing a contrastive deep learning methodology.
  • a method for defect detection for a wafer inspection system can comprise acquiring a first input including a target inspection image and a second input including one or more reference images, generating a first feature vector for the first input and a second feature vector for the second input using a machine learning model, determining a similarity value between the first feature vector and the second feature vector, and determining whether the target inspection image contains a defect based on the similarity value.
  • an apparatus for defect detection for a wafer inspection system can comprise a memory storing a set of instructions; and at least one processor configured to execute the set of instructions to cause the apparatus to perform: acquiring a first input including a target inspection image and a second input including one or more reference images, generating a first feature vector for the first input and a second feature vector for the second input using a machine learning model, determining a similarity value between the first feature vector and the second feature vector, and determining whether the target inspection image contains a defect based on the similarity value.
  • a non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of a computing device to cause the computing device to perform a method for defect detection for a wafer inspection system.
  • the method can comprise acquiring a first input including a target inspection image and a second input including one or more reference images, generating a first feature vector for the first input and a second feature vector for the second input using a machine learning model, determining a similarity value between the first feature vector and the second feature vector, and determining whether the target inspection image contains a defect based on the similarity value.
  • FIG. 1 is a schematic diagram illustrating an example charged-particle beam inspection system, consistent with embodiments of the present disclosure.
  • FIG. 2A is a schematic diagram illustrating an example multi-beam tool, consistent with embodiments of the present disclosure that can be a part of the example charged-particle beam inspection system of FIG. 1.
  • FIG. 2B is a schematic diagram illustrating an example single-beam tool, consistent with embodiments of the present disclosure that can be a part of the example charged-particle beam inspection system of FIG. 1.
  • FIG. 3A is a block diagram of an example defect detection system, consistent with embodiments of the present disclosure.
  • FIG. 3B is a block diagram of part of another example defect detection system, consistent with embodiments of the present disclosure.
  • FIG. 4A illustrates example input data generated by an input generator, consistent with embodiments of the present disclosure.
  • FIG. 4B illustrates example operation blocks of an example encoder, consistent with embodiments of the present disclosure.
  • FIG. 4C illustrates an example process of generating a feature vector based on two input images, consistent with embodiments of the present disclosure.
  • FIG. 5A is a block diagram of an example training system, consistent with embodiments of the present disclosure.
  • FIG. 5B is a block diagram of another example training system, consistent with embodiments of the present disclosure.
  • FIG. 5C is a block diagram of yet another example training system, consistent with embodiments of the present disclosure.
  • FIG. 6A illustrates example test images utilized in performance evaluation experiments for a baseline model and a contrastive CNN model, consistent with embodiments of the present disclosure.
  • FIG. 6B illustrates base images without noise corresponding to test images of FIG. 6A.
  • FIG. 6C illustrates training data that is used for training a contrastive CNN model of FIG. 6A.
  • FIG. 7A and FIG. 7B are graphs illustrating experiment results according to a baseline model and a contrastive CNN model, consistent with embodiments of the present disclosure.
  • FIG. 8A illustrates example test images utilized in performance evaluation experiments for a conventional CNN model and a contrastive CNN model, consistent with embodiments of the present disclosure.
  • FIG. 8B and FIG. 8C are graphs illustrating experiment results according to a conventional CNN model and a contrastive CNN model, consistent with embodiments of the present disclosure.
  • FIG. 9 is a flow chart illustrating an example method for defect detection, consistent with embodiments of the present disclosure.
  • Electronic devices are constructed of circuits formed on a piece of semiconductor material called a substrate.
  • the semiconductor material may include, for example, silicon, gallium arsenide, indium phosphide, or silicon germanium, or the like.
  • Many circuits may be formed together on the same piece of silicon and are called integrated circuits or ICs.
  • the size of these circuits has decreased dramatically so that many more of them can be fit on the substrate.
  • an IC chip in a smartphone can be as small as a thumbnail and yet may include over 2 billion transistors, the size of each transistor being less than 1/1000th the size of a human hair.
  • One component of improving yield is monitoring the chip-making process to ensure that it is producing a sufficient number of functional integrated circuits.
  • One way to monitor the process is to inspect the chip circuit structures at various stages of their formation. Inspection can be carried out using a scanning charged-particle microscope (“SCPM”).
  • For example, an SCPM can be a scanning electron microscope (SEM).
  • a SCPM can be used to image these extremely small structures, in effect, taking a “picture” of the structures of the wafer. The image can be used to determine if the structure was formed properly in the proper location. If the structure is defective, then the process can be adjusted, so the defect is less likely to recur.
  • Defect inspection involves measurements of device structures using inspection images during wafer fabrication processes, and then the measurements are further processed to identify possible defects on the wafer. Normally, defect inspection can be performed by comparison of different regions of inspection images that contain the same structure.
  • One of the current methodologies identifies defects by performing a pixel-wise comparison between one target image and two reference images, which will be referred to as a “baseline methodology” in the present disclosure.
  • the reference images are images corresponding to the target image and are determined not to have any known defects.
  • In the baseline methodology, the target image is subtracted from each of the two reference images, and the results of the two subtractions are multiplied.
  • the target image and reference images should be precisely aligned to each other for accurate defect detection.
  • the target image and the reference images can also be linear filtered to enhance the images before performing the image subtractions. Because the current defect detection methodology is designed by experts based on a limited number of use cases, the linear filters that are utilized in the baseline methodology may not be the optimal solution for a specific application. Further, because the baseline methodology detects defects based on pixel-to-pixel comparison, even a little noise could cause false negatives or false positives in detecting defects. The same is true for a misalignment between the target image and reference images, i.e., small misalignments can lead to false positives (nuisances) or false negatives.
  • A defect detection methodology using a trained convolutional neural network (CNN) machine learning model is referred to as a “conventional CNN methodology” in the present disclosure.
  • a machine learning model is trained to classify whether image patches contain defective structures or not using a large amount of training images containing defects. Then the trained machine learning model, e.g., CNN model, can identify defects based on inference. While the conventional CNN methodology suffers less from misalignment or noise, it may not present stable defect detection performance over various defect patterns.
  • For example, a trained machine learning model, e.g., a conventional CNN model, may need to be retrained with training defect patterns similar to the target defect patterns. That is, the conventional CNN methodology has a shortcoming in that it does not generalize well to defect patterns that deviate from the training data. Therefore, there is a demand in the industry for a defect detection method that can provide good generalization while not being vulnerable to misalignment.
  • a defect detection system that can detect a defect of a target inspection image based on similarity between a feature vector of the target inspection image and a feature vector of a reference image.
  • an encoder that is configured to extract feature vectors is provided.
  • an encoder can be trained to generate feature vectors to contain information that enables classification between a defective structure and a non-defective structure for a target image using one or more reference images.
  • a defect can be more accurately and reliably detected using a framework of contrastive deep learning methodology, which is adapted for defect inspection using inspection images.
  • a defect detection system that can present improved generalization performance can be provided.
  • one target image and two reference images can be used as input.
  • a first feature vector can be encoded for the target image and a first reference image, and a second feature vector can be encoded for the first reference image and a second reference image.
  • the encoded feature vectors are compared to measure similarity between them. According to some embodiments of the present disclosure, whether the target image contains a defect can be determined based on the measured similarity.
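  • As an illustration of this flow, a minimal sketch in PyTorch follows; the encoder module, the image shapes, and the threshold value 0.9 are illustrative assumptions, not details taken from the present disclosure.

```python
import torch
import torch.nn.functional as F

def detect_defect(encoder: torch.nn.Module,
                  target: torch.Tensor,   # (H, W) target inspection image
                  ref1: torch.Tensor,     # (H, W) first reference image
                  ref2: torch.Tensor,     # (H, W) second reference image
                  threshold: float = 0.9) -> bool:
    """Return True if the target image is judged to contain a defect."""
    # First input: target image + first reference image as a 2-channel image.
    x1 = torch.stack([target, ref1]).unsqueeze(0)    # (1, 2, H, W)
    # Second input: the two reference images, stacked the same way.
    x2 = torch.stack([ref1, ref2]).unsqueeze(0)      # (1, 2, H, W)
    v1 = encoder(x1)                                 # first feature vector
    v2 = encoder(x2)                                 # second feature vector
    s = F.cosine_similarity(v1, v2, dim=1)           # similarity in [-1, 1]
    # Low similarity between the feature vectors indicates a defect.
    return bool(s.item() < threshold)
```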
  • the term “or” encompasses all possible combinations, except where infeasible. For example, if it is stated that a component includes A or B, then, unless specifically stated otherwise or infeasible, the component may include A, or B, or A and B. As a second example, if it is stated that a component includes A, B, or C, then, unless specifically stated otherwise or infeasible, the component may include A, or B, or C, or A and B, or A and C, or B and C, or A and B and C.
  • Expressions such as “at least one of” do not necessarily modify an entirety of a following list and do not necessarily modify each member of the list, such that “at least one of A, B, and C” should be understood as including only one of A, only one of B, only one of C, or any combination of A, B, and C.
  • the phrase “one of A and B” or “any one of A and B” shall be interpreted in the broadest sense to include one of A, or one of B.
  • FIG. 1 illustrates an example electron beam inspection (EBI) system 100 consistent with embodiments of the present disclosure.
  • EBI system 100 may be used for imaging.
  • EBI system 100 includes a main chamber 101, a load/lock chamber 102, a beam tool 104, and an equipment front end module (EFEM) 106.
  • Beam tool 104 is located within main chamber 101.
  • EFEM 106 includes a first loading port 106a and a second loading port 106b.
  • EFEM 106 may include additional loading port(s).
  • First loading port 106a and second loading port 106b receive wafer front opening unified pods (FOUPs) that contain wafers (e.g., semiconductor wafers or wafers made of other material(s)) or samples to be inspected (wafers and samples may be used interchangeably).
  • a “lot” is a plurality of wafers that may be loaded for processing as a batch.
  • One or more robotic arms (not shown) in EFEM 106 may transport the wafers to load/lock chamber 102.
  • Load/lock chamber 102 is connected to a load/lock vacuum pump system (not shown) which removes gas molecules in load/lock chamber 102 to reach a first pressure below the atmospheric pressure. After reaching the first pressure, one or more robotic arms (not shown) may transport the wafer from load/lock chamber 102 to main chamber 101.
  • Main chamber 101 is connected to a main chamber vacuum pump system (not shown) which removes gas molecules in main chamber 101 to reach a second pressure below the first pressure. After reaching the second pressure, the wafer is subject to inspection by beam tool 104.
  • Beam tool 104 may be a single-beam system or a multi-beam system.
  • a controller 109 is electronically connected to beam tool 104. Controller 109 may be a computer configured to execute various controls of EBI system 100. While controller 109 is shown in FIG. 1 as being outside of the structure that includes main chamber 101, load/lock chamber 102, and EFEM 106, it is appreciated that controller 109 may be a part of the structure.
  • controller 109 may include one or more processors (not shown).
  • a processor may be an electronic device capable of manipulating or processing information.
  • the processor may include any combination of any number of a central processing unit (or “CPU”), a graphics processing unit (or “GPU”), an optical processor, a programmable logic controller, a microcontroller, a microprocessor, a digital signal processor, an intellectual property (IP) core, a Programmable Logic Array (PLA), a Programmable Array Logic (PAL), a Generic Array Logic (GAL), a Complex Programmable Logic Device (CPLD), a Field-Programmable Gate Array (FPGA), a System On Chip (SoC), an Application-Specific Integrated Circuit (ASIC), a neural processing unit (NPU), and any other type of circuit capable of data processing.
  • the processor may also be a virtual processor that includes one or more processors distributed across multiple machines or devices coupled via a network.
  • controller 109 may further include one or more memories (not shown).
  • a memory may be an electronic device capable of storing codes and data accessible by the processor (e.g., via a bus).
  • the memory may include any combination of any number of a random-access memory (RAM), a read-only memory (ROM), an optical disc, a magnetic disk, a hard drive, a solid-state drive, a flash drive, a security digital (SD) card, a memory stick, a compact flash (CF) card, or any type of storage device.
  • the codes and data may include an operating system (OS) and one or more application programs (or “apps”) for specific tasks.
  • the memory may also be a virtual memory that includes one or more memories distributed across multiple machines or devices coupled via a network.
  • FIG. 2A illustrates a schematic diagram of an example multi-beam beam tool 104A (also referred to herein as apparatus 104A) and an image processing system 290 that may be configured for use in EBI system 100 (FIG. 1), consistent with embodiments of the present disclosure.
  • Beam tool 104A comprises a charged-particle source 202, a gun aperture 204, a condenser lens 206, a primary charged-particle beam 210 emitted from charged-particle source 202, a source conversion unit 212, a plurality of beamlets 214, 216, and 218 of primary charged-particle beam 210, a primary projection optical system 220, a motorized wafer stage 280, a wafer holder 282, multiple secondary charged-particle beams 236, 238, and 240, a secondary optical system 242, and a charged-particle detection device 244.
  • Primary projection optical system 220 can comprise a beam separator 222, a deflection scanning unit 226, and an objective lens 228.
  • Charged-particle detection device 244 can comprise detection sub-regions 246, 248, and 250. Charged-particle source 202, gun aperture 204, condenser lens 206, source conversion unit 212, beam separator 222, deflection scanning unit 226, and objective lens 228 can be aligned with a primary optical axis 260 of apparatus 104A. Secondary optical system 242 and charged-particle detection device 244 can be aligned with a secondary optical axis 252 of apparatus 104A.
  • Charged-particle source 202 can emit one or more charged particles, such as electrons, protons, ions, muons, or any other particle carrying electric charges.
  • charged-particle source 202 may be an electron source.
  • charged-particle source 202 may include a cathode, an extractor, or an anode, wherein primary electrons can be emitted from the cathode and extracted or accelerated to form primary charged-particle beam 210 (in this case, a primary electron beam) with a crossover (virtual or real) 208.
  • Primary charged-particle beam 210 can be visualized as being emitted from crossover 208.
  • Gun aperture 204 can block off peripheral charged particles of primary charged-particle beam 210 to reduce Coulomb effect. The Coulomb effect may cause an increase in size of probe spots.
  • Source conversion unit 212 can comprise an array of image-forming elements and an array of beam-limit apertures.
  • the array of image-forming elements can comprise an array of microdeflectors or micro-lenses.
  • the array of image-forming elements can form a plurality of parallel images (virtual or real) of crossover 208 with a plurality of beamlets 214, 216, and 218 of primary charged-particle beam 210.
  • the array of beam-limit apertures can limit the plurality of beamlets 214, 216, and 218. While three beamlets 214, 216, and 218 are shown in FIG. 2A, embodiments of the present disclosure are not so limited.
  • the apparatus 104A may be configured to generate a first number of beamlets.
  • the first number of beamlets may be in a range from 1 to 1000. In some embodiments, the first number of beamlets may be in a range from 200 to 500. In some embodiments, an apparatus 104A may generate 400 beamlets.
  • Condenser lens 206 can focus primary charged-particle beam 210. The electric currents of beamlets 214, 216, and 218 downstream of source conversion unit 212 can be varied by adjusting the focusing power of condenser lens 206 or by changing the radial sizes of the corresponding beamlimit apertures within the array of beam-limit apertures.
  • Objective lens 228 can focus beamlets 214, 216, and 218 onto a wafer 230 for imaging, and can form a plurality of probe spots 270, 272, and 274 on a surface of wafer 230.
  • Beam separator 222 can be a beam separator of Wien filter type generating an electrostatic dipole field and a magnetic dipole field. In some embodiments, if they are applied, the force exerted by the electrostatic dipole field on a charged particle (e.g., an electron) of beamlets 214, 216, and 218 can be substantially equal in magnitude and opposite in direction to the force exerted on the charged particle by the magnetic dipole field. Beamlets 214, 216, and 218 can, therefore, pass straight through beam separator 222 with a zero deflection angle. However, the total dispersion of beamlets 214, 216, and 218 generated by beam separator 222 can be non-zero. Beam separator 222 can separate secondary charged-particle beams 236, 238, and 240 from beamlets 214, 216, and 218 and direct secondary charged-particle beams 236, 238, and 240 towards secondary optical system 242.
  • Deflection scanning unit 226 can deflect beamlets 214, 216, and 218 to scan probe spots 270, 272, and 274 over a surface area of wafer 230.
  • secondary charged-particle beams 236, 238, and 240 may be emitted from wafer 230.
  • Secondary charged-particle beams 236, 238, and 240 may comprise charged particles (e.g., electrons) with a distribution of energies.
  • secondary charged-particle beams 236, 238, and 240 may be secondary electron beams including secondary electrons (energies ≤ 50 eV) and backscattered electrons (energies between 50 eV and the landing energies of beamlets 214, 216, and 218).
  • Secondary optical system 242 can focus secondary charged-particle beams 236, 238, and 240 onto detection sub-regions 246, 248, and 250 of charged-particle detection device 244.
  • Detection sub-regions 246, 248, and 250 may be configured to detect corresponding secondary charged-particle beams 236, 238, and 240 and generate corresponding signals (e.g., voltage, current, or the like) used to reconstruct an inspection image of structures on or underneath the surface area of wafer 230.
  • the generated signals may represent intensities of secondary charged-particle beams 236, 238, and 240 and may be provided to image processing system 290 that is in communication with charged-particle detection device 244, primary projection optical system 220, and motorized wafer stage 280.
  • the movement speed of motorized wafer stage 280 may be synchronized and coordinated with the beam deflections controlled by deflection scanning unit 226, such that the movement of the scan probe spots (e.g., scan probe spots 270, 272, and 274) may orderly cover regions of interest on the wafer 230.
  • the parameters of such synchronization and coordination may be adjusted to adapt to different materials of wafer 230. For example, different materials of wafer 230 may have different resistance-capacitance characteristics that may cause different signal sensitivities to the movement of the scan probe spots.
  • the intensity of secondary charged-particle beams 236, 238, and 240 may vary according to the external or internal structure of wafer 230, and thus may indicate whether wafer 230 includes defects. Moreover, as discussed above, beamlets 214, 216, and 218 may be projected onto different locations of the top surface of wafer 230, or different sides of local structures of wafer 230, to generate secondary charged-particle beams 236, 238, and 240 that may have different intensities. Therefore, by mapping the intensity of secondary charged-particle beams 236, 238, and 240 with the areas of wafer 230, image processing system 290 may reconstruct an image that reflects the characteristics of internal or external structures of wafer 230.
  • image processing system 290 may include an image acquirer 292, a storage 294, and a controller 296.
  • Image acquirer 292 may comprise one or more processors.
  • image acquirer 292 may comprise a computer, server, mainframe host, terminals, personal computer, any kind of mobile computing devices, or the like, or a combination thereof.
  • Image acquirer 292 may be communicatively coupled to charged-particle detection device 244 of beam tool 104A through a medium such as an electric conductor, optical fiber cable, portable storage media, IR, Bluetooth, internet, wireless network, wireless radio, or a combination thereof.
  • image acquirer 292 may receive a signal from charged-particle detection device 244 and may construct an image.
  • Image acquirer 292 may thus acquire inspection images of wafer 230. Image acquirer 292 may also perform various post-processing functions, such as generating contours, superimposing indicators on an acquired image, or the like. Image acquirer 292 may be configured to perform adjustments of brightness and contrast of acquired images.
  • storage 294 may be a storage medium such as a hard disk, flash drive, cloud storage, random access memory (RAM), other types of computer-readable memory, or the like. Storage 294 may be coupled with image acquirer 292 and may be used for saving scanned raw image data as original images, and postprocessed images. Image acquirer 292 and storage 294 may be connected to controller 296. In some embodiments, image acquirer 292, storage 294, and controller 296 may be integrated together as one control unit.
  • image acquirer 292 may acquire one or more inspection images of a wafer based on an imaging signal received from charged-particle detection device 244.
  • An imaging signal may correspond to a scanning operation for conducting charged particle imaging.
  • An acquired image may be a single image comprising a plurality of imaging areas.
  • the single image may be stored in storage 294.
  • the single image may be an original image that may be divided into a plurality of regions. Each of the regions may comprise one imaging area containing a feature of wafer 230.
  • the acquired images may comprise multiple images of a single imaging area of wafer 230 sampled multiple times over a time sequence.
  • the multiple images may be stored in storage 294.
  • image processing system 290 may be configured to perform image processing steps with the multiple images of the same location of wafer 230.
  • image processing system 290 may include measurement circuits (e.g., analog-to-digital converters) to obtain a distribution of the detected secondary charged particles (e.g., secondary electrons).
  • the charged-particle distribution data collected during a detection time window, in combination with corresponding scan path data of beamlets 214, 216, and 218 incident on the wafer surface, can be used to reconstruct images of the wafer structures under inspection.
  • the reconstructed images can be used to reveal various features of the internal or external structures of wafer 230, and thereby can be used to reveal any defects that may exist in the wafer.
  • the charged particles may be electrons.
  • When electrons of primary charged-particle beam 210 are projected onto a surface of wafer 230 (e.g., probe spots 270, 272, and 274), they may penetrate the surface of wafer 230 to a certain depth, interacting with particles of wafer 230. Some electrons of primary charged-particle beam 210 may elastically interact with (e.g., in the form of elastic scattering or collision) the materials of wafer 230 and may be reflected or recoiled out of the surface of wafer 230.
  • An elastic interaction conserves the total kinetic energies of the bodies (e.g., electrons of primary charged-particle beam 210) of the interaction, in which the kinetic energy of the interacting bodies does not convert to other forms of energy (e.g., heat, electromagnetic energy, or the like).
  • Such reflected electrons generated from elastic interaction may be referred to as backscattered electrons (BSEs).
  • Some electrons of primary charged-particle beam 210 may inelastically interact with (e.g., in the form of inelastic scattering or collision) the materials of wafer 230.
  • An inelastic interaction does not conserve the total kinetic energies of the bodies of the interaction, in which some or all of the kinetic energy of the interacting bodies converts to other forms of energy.
  • the kinetic energy of some electrons of primary charged-particle beam 210 may cause electron excitation and transition of atoms of the materials. Such inelastic interaction may also generate electrons exiting the surface of wafer 230, which may be referred to as secondary electrons (SEs). Yield or emission rates of BSEs and SEs depend on, e.g., the material under inspection and the landing energy of the electrons of primary charged-particle beam 210 landing on the surface of the material, among others.
  • the energy of the electrons of primary charged-particle beam 210 may be imparted in part by its acceleration voltage (e.g., the acceleration voltage between the anode and cathode of charged-particle source 202 in FIG. 2A).
  • the quantity of BSEs and SEs may be more or fewer than (or even the same as) the quantity of injected electrons of primary charged-particle beam 210.
  • Beam tool 104B (also referred to herein as apparatus 104B) may be an example of beam tool 104 and may be similar to beam tool 104A shown in FIG. 2A. However, different from apparatus 104A, apparatus 104B may be a single-beam tool that uses only one primary electron beam to scan one location on the wafer at a time.
  • apparatus 104B includes a wafer holder 136 supported by motorized stage 134 to hold a wafer 150 to be inspected.
  • Beam tool 104B includes an electron emitter, which may comprise a cathode 103, an anode 121, and a gun aperture 122.
  • Beam tool 104B further includes a beam limit aperture 125, a condenser lens 126, a column aperture 135, an objective lens assembly 132, and a detector 144.
  • Objective lens assembly 132 in some embodiments, may be a modified SORIL lens, which includes a pole piece 132a, a control electrode 132b, a deflector unit 132c, and an exciting coil 132d.
  • an electron beam 161 emanating from the tip of cathode 103 may be accelerated by anode 121 voltage, pass through gun aperture 122, beam limit aperture 125, condenser lens 126, and be focused into a probe spot 170 by the modified SORIL lens and impinge onto the surface of wafer 150.
  • Probe spot 170 may be scanned across the surface of wafer 150 by a deflector, such as deflector unit 132c or other deflectors in the SORIL lens.
  • Secondary or scattered particles, such as secondary electrons or scattered primary electrons emanating from the wafer surface, may be collected by detector 144 to determine the intensity of the beam so that an image of an area of interest on wafer 150 may be reconstructed.
  • Image acquirer 120 may comprise one or more processors.
  • image acquirer 120 may comprise a computer, server, mainframe host, terminals, personal computer, any kind of mobile computing devices, and the like, or a combination thereof.
  • Image acquirer 120 may connect with detector 144 of beam tool 104B through a medium such as an electrical conductor, optical fiber cable, portable storage media, IR, Bluetooth, internet, wireless network, wireless radio, or a combination thereof.
  • Image acquirer 120 may receive a signal from detector 144 and may construct an image. Image acquirer 120 may thus acquire images of wafer 150.
  • Image acquirer 120 may also perform various post-processing functions, such as image averaging, generating contours, superimposing indicators on an acquired image, and the like. Image acquirer 120 may be configured to perform adjustments of brightness and contrast, etc. of acquired images.
  • Storage 130 may be a storage medium such as a hard disk, random access memory (RAM), cloud storage, other types of computer readable memory, and the like. Storage 130 may be coupled with image acquirer 120 and may be used for saving scanned raw image data as original images, and post-processed images.
  • Image acquirer 120 and storage 130 may be connected to controller 109. In some embodiments, image acquirer 120, storage 130, and controller 109 may be integrated together as one electronic control unit.
  • image acquirer 120 may acquire one or more images of a sample based on an imaging signal received from detector 144.
  • An imaging signal may correspond to a scanning operation for conducting charged particle imaging.
  • An acquired image may be a single image comprising a plurality of imaging areas that may contain various features of wafer 150.
  • the single image may be stored in storage 130. Imaging may be performed on the basis of imaging frames.
  • the condenser and illumination optics of the electron beam tool may comprise or be supplemented by electromagnetic quadrupole electron lenses.
  • electron beam tool 104B may comprise a first quadrupole lens 148 and a second quadrupole lens 158.
  • the quadrupole lenses may be used for controlling the electron beam.
  • first quadrupole lens 148 may be controlled to adjust the beam current
  • second quadrupole lens 158 may be controlled to adjust the beam spot size and beam shape.
  • the images generated by SCPM may be used for defect inspection. For example, a generated image capturing a test device region of a wafer may be compared with a reference image capturing the same test device region.
  • the reference image may be predetermined (e.g., by simulation) and include no known defect. If a difference between the generated image and the reference image exceeds a tolerance level, a potential defect may be identified.
  • the SCPM may scan multiple regions of the wafer, each region including a test device region designed as the same, and generate multiple images capturing those test device regions as manufactured. The multiple images may be compared with each other. If a difference between the multiple images exceeds a tolerance level, a potential defect may be identified.
  • machine learning may be employed in the generation of inspection images, reference images, or other images associated with apparatus 100, 104A, or 104B.
  • a machine learning system may be operated in association with, e.g., controller 109 or 296, image processing system 199 or 290, image acquirer 120 or 292, or storage unit 130 or 294 of FIGs. 1-2B.
  • machine learning may be employed in the defect detection method, e.g., method 900 of FIG. 9, in association with, e.g., a defect detection system 300 of FIGs. 3A-3B, which will be described below.
  • machine learning may be employed in a training system 500 of FIGs. 5A-5C, which will be described below.
  • a machine learning system may comprise a discriminative model.
  • a machine learning system may include a generative model.
  • learning can feature two types of mechanisms: discriminative learning that may be used to create classification and detection algorithms, and generative learning that may be used to actually create models that, in the extreme, can render images.
  • a generative model may be configured for generating an image from a design clip that resembles a corresponding location on a wafer in a SEM image.
  • the discriminative model(s) may have any suitable architecture or configuration known in the art.
  • Discriminative models, also called conditional models, are a class of models used in machine learning for modeling the dependence of an unobserved variable “y” on an observed variable “x.” Within a probabilistic framework, this may be done by modeling a conditional probability distribution P(y|x), which can be used for predicting y based on x. Discriminative models, as opposed to generative models, may not allow one to generate samples from the joint distribution of x and y. However, for tasks such as classification and regression that do not require the joint distribution, discriminative models may yield superior performance. On the other hand, generative models are typically more flexible than discriminative models in expressing dependencies in complex learning tasks. In addition, most discriminative models are inherently supervised and cannot easily be extended to unsupervised learning. Application-specific details ultimately dictate the suitability of selecting a discriminative versus generative model.
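  • Written out in standard probabilistic notation (a textbook identity, not language from the present disclosure), the two model classes relate as follows:

```latex
% Discriminative model: learns the conditional P(y | x) directly.
% Generative model: learns the joint P(x, y); the conditional then
% follows from Bayes' rule:
\[
P(y \mid x) \;=\; \frac{P(x, y)}{P(x)} \;=\; \frac{P(x \mid y)\, P(y)}{P(x)}
\]
```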
  • a generative model can be generally defined as a model that is probabilistic in nature.
  • a “generative” model is not one that performs forward simulation or rule-based approaches and, as such, it may not be necessary to model the physics of the processes involved in generating an actual image or output (for which a simulated image or output is being generated). Instead, the generative model can be learned (in that its parameters can be learned) based on a suitable training set of data.
  • Such generative models may have a number of advantages for the embodiments described herein.
  • the generative model may be configured to have a deep learning architecture in that the generative model may include multiple layers, which may perform a number of algorithms or transformations. The number of layers included in the generative model may depend on the particular use case. For practical purposes, a suitable range of layers is from 2 layers to a few tens of layers.
  • Deep learning is a type of machine learning.
  • Machine learning can be generally defined as a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed.
  • Machine learning focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.
  • Machine learning explores the study and construction of algorithms that can learn from and make predictions on data — such algorithms overcome strictly static program instructions by making data-driven predictions or decisions through building a model from sample inputs.
  • the machine learning described herein may be further performed as described in “Introduction to Statistical Machine Learning,” by Sugiyama, Morgan Kaufmann, 2016, 534 pages; “Discriminative, Generative, and Imitative Learning,” Jebara, MIT Thesis, 2002, 212 pages; and “Principles of Data Mining (Adaptive Computation and Machine Learning)” Hand et al., MIT Press, 2001, 578 pages; which are incorporated by reference as if fully set forth herein.
  • the embodiments described herein may be further configured as described in these references.
  • a machine learning system may comprise a neural network.
  • a model may be a deep neural network with a set of weights that model the world according to the data that it has been fed to train it.
  • Neural networks can be generally defined as a machine learning approach that is based on a collection of connected artificial neurons, inspired by a biological brain, that learns to solve problems from data. Each neural unit is connected with many others, and links can be enforcing or inhibitory in their effect on the activation state of connected neural units. These systems are self-learning and trained rather than explicitly programmed and excel in areas where the solution or feature detection is difficult to express in a traditional computer program.
  • Neural networks typically consist of multiple layers, and the signal path traverses from front to back.
  • the goal of the neural network is to solve problems in the same way that the human brain would, although several neural networks are much more abstract.
  • Modern neural network projects typically work with a few thousand to a few million neural units and millions of connections.
  • the neural network may have any suitable architecture or configuration known in the art.
  • a model may comprise a convolutional and deconvolutional neural network.
  • the embodiments described herein can take advantage of learning concepts such as a convolution and deconvolution neural network to solve the normally intractable representation conversion problem (e.g., rendering).
  • the model may have any convolution and deconvolution neural network configuration or architecture known in the art.
  • defect detection system 300 can comprise one or more processors and memories. It is appreciated that in various embodiments defect detection system 300 may be part of or may be separate from a charged-particle beam inspection system (e.g., EBI system 100 of FIG. 1). In some embodiments, defect detection system 300 may include one or more components (e.g., software modules) that can be implemented in controller 109 or systems 290 or 199 as discussed herein. As shown in FIG. 3A, defect detection system 300 may comprise an input generator 3100, an encoder 3200, and a defect checker 3300.
  • input generator 3100 can be configured to generate first input data 401 and second input data 402.
  • input generator 3100 can acquire a target image 3001 as an input image.
  • defect detection system 300 is configured to determine whether target image 3001 has a defect or not.
  • target image 3001 is an inspection image such as a SCPM image of a sample or a wafer.
  • target image 3001 can be an inspection image generated by, e.g., EBI system 100 of FIG. 1 or electron beam tool 104A of FIG. 2A or 104B of FIG. 2B.
  • target image 3001 may be obtained from a storage device or system storing the inspection image.
  • target image 3001 can be an image of structures in the region of interest of the wafer.
  • input generator 3100 can acquire a reference image, e.g., first reference image 3002 or a second reference image 3003 as an input image.
  • the reference image can be an image corresponding to target image 3001.
  • a reference image can be an inspection image that is determined not to have a defect.
  • both first reference image 3002 and second reference image 3003 can correspond to target image 3001, but first reference image 3002 and second reference image 3003 may not be identical.
  • first reference image 3002 and second reference image 3003 include images of the structures that are the subject of target image 3001.
  • first reference image 3002 and second reference image 3003 may be different from each other because multiple inspection images of the same target structures may present various variations.
  • a reference image can be a layout file for a wafer design corresponding to the inspection image.
  • the layout file can be in a Graphic Database System (GDS) format, Graphic Database System II (GDS II) format, an Open Artwork System Interchange Standard (OASIS) format, a Caltech Intermediate Format (CIF), etc.
  • the wafer design may include patterns or structures for inclusion on the wafer. The patterns or structures can be mask patterns used to transfer features from the photolithography masks or reticles to a wafer.
  • a layout in GDS or OASIS format may comprise feature information stored in a binary file format representing planar geometric shapes, text, and other information related to the wafer design.
  • a reference image can be an image rendered from the layout file.
  • input generator 3100 may include first input generator 3110 and second input generator 3120.
  • first input generator 3110 can be configured to acquire target image 3001 and first reference image 3002.
  • first input generator 3110 can be configured to concatenate two input images to be used as an input to an encoder 3200, which will be described below.
  • first input generator 3110 can combine target image 3001 and first reference image 3002 to constitute first input data 401 having two channels.
  • FIG. 4A illustrates example first input data 401 having two channels, consistent with some embodiments of the present disclosure. As shown in FIG. 4A, first input data 401 is structured as a number of channels (e.g., C), each of which is a two-dimensional (2D) layer having a size of HxW.
  • first input data 401 has two channels, and each channel of first input data 401 has a size of 32x32.
  • each of target image 3001 and first reference image 3002 constitutes one channel of first input data 401.
  • input generator 3100 may adjust input images to have the same size, e.g., 32x32 as an example shown in FIG. 4A. It should be noted that input size 32x32 is merely an example, and any input sizes can be utilized in embodiments of the present disclosure.
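  • A minimal sketch of such an input generator, assuming PyTorch and grayscale (H, W) image tensors (the function name and the bilinear resizing choice are our assumptions):

```python
import torch
import torch.nn.functional as F

def make_two_channel_input(image_a: torch.Tensor,
                           image_b: torch.Tensor,
                           size: int = 32) -> torch.Tensor:
    """Combine two (H, W) images into one (1, 2, size, size) input tensor."""
    resized = [
        F.interpolate(img[None, None].float(), size=(size, size),
                      mode="bilinear", align_corners=False)
        for img in (image_a, image_b)
    ]
    # Concatenate along the channel axis: channel 0 = image_a, 1 = image_b.
    return torch.cat(resized, dim=1)
```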
  • second input generator 3120 can be configured to acquire first reference image 3002 and second reference image 3003.
  • second input generator 3120 can be configured to constitute second input data 402 based on first reference image 3002 and second reference image 3003, similar to first input data 401 as shown in FIG. 4A. While input generator 3110 or 3120 has been described as combining two input images to generate input data 401 or 402, it will be appreciated that first input generator 3110 can take target image 3001 alone to generate first input data 401 and second input generator 3120 can take one reference image 3002 or 3003 to generate second input data 402.
  • first input generator 3110 and second input generator 3120 may adjust target image 3001 and a reference image to align to each other and have the same size, and input data 401 or 402 can be inputted to encoder 3200 as a single channel.
  • defect detection system 300 may not include input generator 3100 as shown in FIG. 3B.
  • target image 3001 can be provided to first encoder 3210 as first input data 401 and reference image 3002 can be provided to second encoder 3220 as second input data 402.
  • first input data 401 includes target image 3001 and second input data 402 includes one or more reference images.
  • encoder 3200 can be configured to generate a feature vector from input data, which is generated by input generator 3100, consistent with some embodiments of the present disclosure.
  • encoder 3200 may include first encoder 3210 and second encoder 3220.
  • first encoder 3210 can be configured to generate a first feature vector 3211 from first input data 401.
  • second encoder 3220 can be configured to generate a second feature vector 3222 from second input data 402.
  • For convenience, operations of first encoder 3210 or second encoder 3220 will be described with reference to encoder 3200. It will be appreciated that descriptions of encoder 3200 are applicable to first encoder 3210 or second encoder 3220 unless otherwise indicated.
  • first encoder 3210 and second encoder 3220 may share weights with each other. It will be also noted that first encoder 3210 and second encoder 3220 may have different weights from each other in some embodiments.
  • encoder 3200 can be implemented with a contrastive convolutional neural network (CNN).
  • encoder 3200 can be implemented by various network architectures including, but not limited to, a Visual Geometry Group (VGG) neural network, a residual neural network (ResNet), a dense convolutional network (DenseNet), a transformer-based neural network, etc.
  • encoder 3200 can be a non-linear encoder.
  • encoder 3200 can be configured to generate a feature vector that contains information of input data to be used in defect checker 3300, which will be described later.
  • first encoder 3210 can be a trained machine learning model to extract features from first input data 401 as first feature vector 3211.
  • features extracted from first input data 401 as first feature vector 3211 can indicate differences between target image 3001 and first reference image 3002, which can indicate a defect in target image 3001.
  • second encoder 3220 can be a trained machine learning model to extract features from second input data 402 as second feature vector 3222.
  • features extracted from second input data 402 as second feature vector 3222 can indicate possible variations presented in non-defective structures and inherent to a lithography process or other steps of semiconductor fabrication processes.
  • features extracted from first input data 401 as first feature vector 3211 can include features that may vary depending on defect-structures; and features extracted from second input data 402 as second feature vector 3222 can include features that can represent possible variations presented in non-defective structures and inherent to a lithography process or other steps of semiconductor fabrication processes.
  • FIG. 4B illustrates example operation blocks of encoder 3200, consistent with embodiments of the present disclosure.
  • encoder 3200 can include a plurality of operation blocks, each of which can include one or more layers.
  • encoder 3200 can include one or more convolution blocks 3201, ReLU blocks 3202, Pooling blocks 3203, and the like.
  • Convolution blocks 3201 each can perform a convolution operation with different or similar parameters.
  • convolution blocks 3201 can have different convolution filter sizes (e.g., 3x3, 4x4, etc.), convolution filter numbers, and strides.
  • ReLU blocks 3202 can be configured to apply activation function ReLU.
  • ReLU block 3202 can be configured to be connected to convolution block 3201.
  • ReLU block 3202 can be a Leaky ReLU block.
  • other types of activation function blocks can be utilized, including, but not limited to, a ReLU block, a Sigmoid block, a Tanh block, and the like.
  • pooling blocks 3203 can be configured to perform a pooling operation after or before convolution block 3201.
  • pooling blocks 3203 can be implemented as average pooling or max pooling blocks.
  • pooling block 3203 can have a pooling filter with a certain size (e.g., 2x2) and a stride (e.g., a stride of 2).
  • in some embodiments, encoder 3200 can be implemented using other neural networks, e.g., residual neural networks, transformer neural networks, etc.
  • input data 401 or 402 can be input into encoder 3200 that can perform one or more convolution operations with convolution blocks 3201, one or more activation operations with ReLU blocks 3202, or one or more pooling operations with pooling blocks 3203.
  • FIG. 4C illustrates example process 410 of generating a feature vector based on input data via encoder 3200, consistent with embodiments of the present disclosure.
  • feature vector generating process 410 is described using first input data 401 shown in FIG. 4A as an example.
  • first feature vector 3211 can be generated from first input data 401 via multiple steps.
  • first feature data 411a of first input data 401 can be generated using convolution block 3201 having seven filters with a filter size 3x3 and using first input data 401 as input.
  • the resultant first feature data 411a has seven kernels and a size of 30x30.
  • first feature data 411a can be obtained by applying ReLU block 3202 after convolution block 3201.
  • second feature data 411b of first input data 401 can be generated using convolution block 3201 having fourteen filters with a filter size 3x3, and the resultant second feature data 411b has fourteen kernels and a size of 28x28.
  • second feature data 411b can be obtained by applying ReLU block 3202 after convolution block 3201.
  • As shown in FIG. 4C, third feature data 411c of first input data 401 can be obtained by applying pooling block 3203 having a size of 2x2 to second feature data 411b, and the resultant third feature data 411c has fourteen kernels and a size of 14x14.
  • a combination of convolution block 3201 having a size of 3x3 and ReLU block 3202 can be applied to third feature data 411c to generate fourth feature data 411d having 28 kernels and a size of 12x12.
  • Pooling block 3203 having a size of 2x2 can be applied to fourth feature data 411d to generate fifth feature data 411e having 28 kernels and a size of 6x6.
  • a combination of convolution block 3201 having a size of 3x3 and ReLU block 3202 can be applied to fifth feature data 411e to generate sixth feature data 411f having 28 kernels and a size of 4x4.
  • Another convolution block 3201 having a size of 4x4 can be applied to sixth feature data 411f to generate seventh feature data 411g having 28 kernels and a size of 1x1.
  • a flattening operation can be performed on seventh feature data 411g to convert it into a feature vector as a one-dimensional array, as shown in FIG. 4C.
  • For example, three-dimensional seventh feature data 411g having size 28x1x1 can be flattened by a flattening operation to a one-dimensional feature vector having size 28x1.
  • While process 410 for generating a feature vector has been described referring to FIG. 4C, applying convolution operations, pooling operations, and ReLU operations in a certain order with certain settings (e.g., filter number, filter size, output size, etc.), it will be appreciated that various operations with different settings can be utilized to generate a one-dimensional feature vector from multi-dimensional input data.
  • process 410 for generating a feature vector has been described with respect to input data combining two images, e.g., target image 3001 and first reference image 3002 as shown in FIG. 4A, it will be appreciated that the same or similar process of generating a feature vector can be applicable to embodiments where input data is comprised of one image, e.g., target image 3001 as shown in FIG. 3B.
  • second feature vector 3222 can be obtained using second input data 402 as input to encoder 3200.
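  • Assuming a PyTorch implementation, the block sequence walked through above for FIG. 4C could be sketched as follows; max pooling is shown, though average pooling is equally possible per the text, and all sizes mirror those stated above:

```python
import torch.nn as nn

# Encoder sketch following the FIG. 4C walkthrough: 2x32x32 input,
# 28-dimensional output feature vector.
contrastive_encoder = nn.Sequential(
    nn.Conv2d(2, 7, kernel_size=3),    # (2, 32, 32) -> (7, 30, 30)
    nn.ReLU(),
    nn.Conv2d(7, 14, kernel_size=3),   # -> (14, 28, 28)
    nn.ReLU(),
    nn.MaxPool2d(2),                   # -> (14, 14, 14)
    nn.Conv2d(14, 28, kernel_size=3),  # -> (28, 12, 12)
    nn.ReLU(),
    nn.MaxPool2d(2),                   # -> (28, 6, 6)
    nn.Conv2d(28, 28, kernel_size=3),  # -> (28, 4, 4)
    nn.ReLU(),
    nn.Conv2d(28, 28, kernel_size=4),  # -> (28, 1, 1)
    nn.Flatten(),                      # -> 28-element feature vector
)
```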
  • defect checker 3300 can be configured to determine whether target image 3001 contains a defect or not. In some embodiments, defect checker 3300 can be configured to determine whether target image 3001 contains a defect based on similarity between first feature vector 3211 and second feature vector 3222. According to some embodiments of the present disclosure, defect checker 3300 can determine that target image 3001 does not contain a defect if the similarity is greater than or equal to a certain threshold. In contrast, defect checker 3300 can determine that target image 3001 contains a defect if the similarity is less than the certain threshold.
  • an example similarity metric (1) can be expressed as:

        S(v1, v2) = (v1 · v2) / (||v1|| ||v2||), S ∈ [-1, 1]    (1)

  • S represents a similarity value.
  • v1 and v2 represent first feature vector 3211 and second feature vector 3222, respectively.
  • S ∈ [-1, 1] denotes that similarity value S ranges from -1 to 1.
  • similarity metric (1) is implemented using a cosine similarity metric to quantify similarity between two feature vectors v1 and v2, i.e., first feature vector 3211 and second feature vector 3222.
  • According to similarity metric (1), as the similarity of two vectors v1 and v2 increases, similarity value S becomes closer to 1.
  • As the two vectors become more dissimilar, similarity value S becomes closer to -1. While a similarity metric using a cosine similarity metric has been described, it will be appreciated that other types of similarity metrics can be utilized to determine similarity or dissimilarity between two feature vectors.
  • defect checker 3300 can be configured to determine target image 3001 does not contain a defect if similarity value S is greater than or equal to a threshold value between -1 and 1.
  • the threshold value can be determined depending on target defect detection accuracy, nuisance error tolerance, false positive or negative detection rate, etc.
  • the similarity value range or a threshold value can change according to a similarity metric utilized in embodiments of the present disclosure. For example, if a mean squared error (MSE) method is used for a similarity metric, a threshold value can be any real number.
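  • As an illustration of how defect checker 3300 could combine similarity metric (1) with a threshold, the following is a minimal sketch; the function name and the default threshold of 0.5 are assumptions for illustration, not values from the disclosure:

```python
import torch
import torch.nn.functional as F

def is_defective(v1: torch.Tensor, v2: torch.Tensor, threshold: float = 0.5) -> bool:
    """Hypothetical defect check: cosine similarity of two feature vectors
    compared against a tunable threshold in [-1, 1]."""
    s = F.cosine_similarity(v1, v2, dim=-1)   # similarity metric (1)
    return bool(s < threshold)                # below threshold -> defect

v1, v2 = torch.randn(28), torch.randn(28)     # e.g., 28-element feature vectors
print(is_defective(v1, v2))
```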
  • encoder 3200 of FIGs. 3A-3B can be implemented as a contrastive deep learning model, which can be trained by a training system before being applied to encoder 3200.
  • the contrastive deep learning model can be a Siamese Neural Network (SNN).
  • FIGs. 5A-5C are block diagrams of a training system 500 (also referred to as “apparatus 500”) for an encoder model, consistent with embodiments of the present disclosure.
  • training system 500 may comprise one or more processors and memories. It is appreciated that in various embodiments training system 500 may be part of or may be separate from a charged-particle beam inspection system (e.g., EBI system 100 of FIG. 1).
  • training system 500 may include one or more components or modules separate from and communicatively coupled to the charged-particle beam inspection system.
  • training system 500 may include one or more components (e.g., software modules) that can be implemented in controller 109 or system 290 or 199 as discussed herein.
  • training system 500 and defect detection system 300 can be implemented on separate computing devices or on the same computing device.
  • training system 500 may be part of defect detection system 300 or encoder 3200 of defect detection system 300 of FIGs. 3A-3B.
  • training system 500 may comprise a training input data generator 510 and a model trainer 520.
  • training input data generator 510 can be implemented similar to input generator 3100 of FIG. 3A. Similar to input generator 3100, training input data generator 510 can acquire three training images: a training target image 501 and two training reference images 502 and 503 corresponding to training target image 501. In some embodiments, training input data generator 510 can include first and second input generators, where the first input generator is configured to acquire and combine training target image 501 and first training reference image 502. Thereby, training input data generator 510 can generate first training input data, which is similar to first input data 401 of FIG. 4A. Similarly, training input data generator 510 can generate second training input data, similar to second input data 402 of FIG. 4A.
  • In some embodiments, training input data generator 510 can acquire two training images: a training target image 501 and one training reference image 502 or 503.
  • training target image 501 can be associated with a label that indicates whether the training target image 501 has a defect or not.
  • input generator 3100 of FIG. 3A can be utilized as training input data generator 510.
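  • The "combining" of a target image with a reference image can be sketched as channel stacking; this stacking interpretation and all names below are assumptions for illustration, since the disclosure does not prescribe a specific combination operation here:

```python
import torch

def combine(image_a: torch.Tensor, image_b: torch.Tensor) -> torch.Tensor:
    """Sketch: stack two single-channel HxW images into one 2-channel input."""
    return torch.stack([image_a, image_b], dim=0)   # shape: (2, H, W)

target_img, ref_img_1, ref_img_2 = (torch.randn(32, 32) for _ in range(3))
first_training_input = combine(target_img, ref_img_1)   # target + first reference
second_training_input = combine(ref_img_1, ref_img_2)   # first + second reference
```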
  • model trainer 520 is configured to train encoder model 521 to predict whether training target image 501 has a defect or not.
  • model trainer 520 is configured to train encoder model 521 under supervised learning.
  • encoder model 521 can be configured to have a first encoder model and a second encoder model.
  • the first encoder model can be utilized as first encoder 3210 and the second encoder model can be utilized as second encoder 3220 after training of encoder model 521 is complete.
  • model trainer 520 can train encoder model 521 using a plurality of sets of training target image 501 and reference images 502 and 503.
  • the objective of model trainer 520 is to maximize similarity between two feature vectors generated by encoder model 521 when training target image 501 does not include a defective structure, and to minimize similarity between the two feature vectors when training target image 501 includes a defective structure.
  • model trainer 520 is configured to update or adjust weights or parameters of encoder model 521 according to the resultant similarity values during training.
  • model trainer 520 can confirm whether an inference result of model trainer 520 is accurate or not based on the label of training target image 501, where the label indicates whether training target image 501 includes a defective structure or not.
  • model trainer 520 can measure similarity of two feature vectors generated by encoder model 521 using a similarity metric, e.g., similarity metric (1) used in defect detection system 300 of FIG. 3A.
  • model trainer 520 can train encoder model 521 using a loss function based on a similarity metric, e.g., similarity metric (1).
  • An example loss function (2) can be defined based on similarity metric (1), where:
  • L represents a loss function value
  • N represents a training batch size
  • i represents a training target image index included in the training batch, where 0 ≤ i < N.
  • the max function returns the largest value in the given list of arguments.
  • model trainer 520 can train encoder model 521 to make loss function value L closer to 1 when training target images do not have a defective structure, and closer to -1 when training target images have a defective structure.
  • model trainer 520 is configured to update or adjust weights or parameters of encoder model 521 based on loss function value L. While a binary cross-entropy type loss function, e.g., loss function (2), is explained as a metric to train encoder model 521, it will be appreciated that other metrics can be utilized to determine whether encoder model 521 correctly predicts defect existence in training images.
  • trained encoder model 521 can be utilized as encoder 3200 in defect detection system 300 of FIG. 3A.
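  • Since the exact form of loss function (2) is not reproduced above, the following sketch illustrates the described training objective with a stand-in: cosine similarity is pushed toward +1 for non-defective training target images and toward -1 for defective ones. The label encoding and the mean-squared-error surrogate are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def training_step(encoder, optimizer, first_input, second_input, labels):
    """Hypothetical supervised training step for encoder model 521.
    labels: +1.0 for non-defective training target images, -1.0 for
    defective ones (an assumed encoding)."""
    v1 = encoder(first_input)                  # first training feature vector
    v2 = encoder(second_input)                 # second training feature vector
    s = F.cosine_similarity(v1, v2, dim=-1)    # similarity metric (1)
    loss = F.mse_loss(s, labels)               # push s toward +1 or -1
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```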
  • FIG. 5B is a block diagram of another example training system 500 having example encoder model 521B that can be trained using contrastive loss function 532B.
  • encoder model 521B can be configured to have a first encoder model 5210 and a second encoder model 5220.
  • first encoder model 5210 can be configured to receive training target image 501 as input data and second encoder model 5220 can be configured to receive training reference image 502 corresponding to training target image 501 as input data.
  • first encoder model 5210 and second encoder model 5220 of FIG. 5B can be identical networks.
  • first encoder model 5210 can be utilized as first encoder 3210 of FIG. 3B and second encoder model 5220 can be utilized as second encoder 3220 of FIG. 3B after training of encoder model 521B is complete.
  • model trainer 520B can train encoder model 521B using a plurality of sets of training target image 501 and reference images 502.
  • the objective of model trainer 520B is to maximize similarity between two feature vectors generated by encoder model 521B when training target image 501 does not include a defective structure, and to minimize similarity between the two feature vectors when training target image 501 includes a defective structure.
  • model trainer 520B can train encoder model 521B using a contrastive loss function 532B; assuming the standard contrastive-loss form, it can be expressed as:

        L_con = (1/N) Σ_i [ (1 - y_i) · d_i² + y_i · max(0, ε - d_i)² ]

  • L_con represents a contrastive loss function value.
  • d_i represents the L2-norm distance between the two feature vectors generated by first encoder model 5210 and second encoder model 5220 of FIG. 5B for the i-th training pair, i.e., d_i = ||v1 - v2||₂.
  • ε represents a marginal distance to classify a defective image and a non-defective image.
  • margin ε is incorporated into contrastive loss function 532B to assist the optimization algorithm such that effort can be focused on instances whose classification is difficult.
  • y_i represents a label value of the i-th training target image depending on defect existence (here taken as y_i = 1 for a defective image and y_i = 0 for a non-defective image, an assumed convention).
  • model trainer 520B can train encoder model 521B to make loss function value L_con closer to 0 in either case: when training target images do not have a defective structure, or when training target images have a defective structure.
  • model trainer 520B is configured to update or adjust weights or parameters of encoder model 521B based on loss function value L_con.
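  • A minimal sketch of contrastive loss function 532B in the form given above (the label convention and all names are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(v1, v2, y, margin=1.0):
    """Sketch of a contrastive loss: y = 0 for a non-defective pair,
    y = 1 for a defective pair; margin plays the role of the marginal
    distance epsilon."""
    d = F.pairwise_distance(v1, v2)                        # L2-norm distance
    loss = (1 - y) * d.pow(2) + y * F.relu(margin - d).pow(2)
    return loss.mean()
```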
  • FIG. 5C is a block diagram of another training system 500 having an example encoder model 521C that can be trained using a triplet loss function 532C.
  • encoder model 521C can be configured to have a first encoder model 5210, a second encoder model 5220, and a third encoder model 5230.
  • first encoder model 5210 can be configured to receive training reference image 501C as input data
  • second encoder model 5220 can be configured to receive training non-defective image 502C corresponding to training reference image 501C as input data
  • third encoder model 5230 can be configured to receive training defective image 503C corresponding to training reference image 501C as input data.
  • first encoder model 5210, second encoder model 5220, and third encoder model 5230 of FIG. 5C can be identical networks.
  • first encoder model 5210, second encoder model 5220, and third encoder model 5230 of FIG. 5C can share weights with each other, and any of them can be utilized as any of first encoder 3210 and second encoder 3220 of FIG. 3B after training of encoder model 521C is complete.
  • first encoder model 5210 of FIG. 5C can be utilized as first encoder 3210 of FIG. 3B, and any of second encoder model 5220 and third encoder model 5230 of FIG. 5C can be utilized as second encoder 3220 of FIG. 3B after training of encoder model 521C is complete.
  • model trainer 520C can train encoder model 521C using a plurality of sets of training images including training reference images 501C, training non-defective images 502C, and training defective images 503C.
  • the objective of model trainer 520C is to maximize similarity between two feature vectors generated by first encoder model 5210 and second encoder model 5220; and to minimize similarity between two feature vectors generated by first encoder model 5210 and third encoder model 5230.
  • model trainer 520C can train encoder model 521C using a triplet loss function 532C; assuming the conventional triplet-loss form, it can be expressed as:

        L_tri = (1/N) Σ_i max(d1_i - d2_i + ε, 0)

  • L_tri represents a triplet loss function value.
  • d1 represents the L2-norm distance between the two feature vectors generated by first encoder model 5210 and second encoder model 5220 of FIG. 5C.
  • d2 represents the L2-norm distance between the two feature vectors generated by first encoder model 5210 and third encoder model 5230 of FIG. 5C.
  • ε represents a marginal distance to classify a defective image and a non-defective image.
  • model trainer 520C can train encoder model 521C to make loss function value L_tri closer to 0.
  • model trainer 520C is configured to update or adjust weights or parameters of encoder model 521C based on loss function value L_tri.
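  • A minimal sketch of triplet loss function 532C in the form given above, with the anchor, positive, and negative roles taken by the training reference, non-defective, and defective feature vectors, respectively (role and function names assumed):

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Sketch of a triplet loss: anchor = feature vectors of training
    reference images 501C, positive = training non-defective images 502C,
    negative = training defective images 503C."""
    d1 = F.pairwise_distance(anchor, positive)   # reference vs. non-defective
    d2 = F.pairwise_distance(anchor, negative)   # reference vs. defective
    return F.relu(d1 - d2 + margin).mean()       # -> 0 when d2 >= d1 + margin
```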
  • training encoder model 521C with informative training images can improve classification performance of encoder model 521C after training is complete.
  • a training image set can be selected to challenge triplet loss function 532C during training such that encoder model 521C can be trained to extract feature vectors that enable it to discern defective images from non-defective images.
  • Encoder model 521C trained using a training image set selected under criterion A may not classify defective and non-defective images as accurately as encoder model 521C trained using a training image set selected under criterion B. However, training with a training image set selected under criterion B may lead to overfitting or unstable training results.
  • Criterion C could be selected as a criterion for selecting a training image set to compensate for the classification accuracy of criterion A while reducing instability or overfitting issues during training.
  • Margin ε can be determined considering various factors including an accuracy requirement, classification stability, etc.
  • model trainers 520, 520B, and 520C can be implemented with a contrastive framework.
  • the contrastive framework could utilize an SNN-contrastive framework along with a corresponding loss function.
  • the SNN-contrastive framework could contain two or more identical subnetworks as encoder model 521, 521B, or 521C, depending on the loss function.
  • model trainer 520B of FIG. 5B can comprise encoder model 521B utilizing two sub-networks and can incorporate contrastive loss function 532B.
  • model trainer 520C of FIG. 5C can comprise encoder model 521C utilizing three sub-networks and can incorporate triplet loss function 532C.
  • FIG. 6A illustrates test images 610, 620, and 630 with noise, which are used as target test images.
  • first base image 611 contains a nondefective structure
  • second base image 621 contains a defective structure having a shrunk contact-hole defect
  • third base image 631 contains a defective structure having a missing contact-hole defect.
  • test images 610, 620, and 630 are obtained by applying Gaussian noise with a standard deviation of 1.0 to base images 611, 621, and 631 to mimic real inspection images, e.g., SCPM images.
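  • A minimal sketch of how such noisy test images could be derived from synthetic base images, assuming additive Gaussian noise as described (the function name and seeding are illustrative assumptions):

```python
import numpy as np

def make_test_image(base_image, sigma=1.0, seed=None):
    """Sketch: derive a noisy target test image from a synthetic base image
    by adding Gaussian noise with standard deviation sigma."""
    rng = np.random.default_rng(seed)
    return base_image + rng.normal(0.0, sigma, size=base_image.shape)
```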
  • a contrastive CNN model, which can be utilized in defect detection system 300, is trained using training data illustrated in FIG. 6C. As shown in FIG. 6A and FIG. 6C, structures that are not included in the training data are used in test images 610, 620, and 630 to evaluate the generalizability of the contrastive CNN model to structures it has not learned.
  • FIG. 7A and FIG. 7B are graphs illustrating defect detection experiment results according to a conventional baseline model and a contrastive CNN model according to some embodiments of the present disclosure.
  • first graph 710 illustrates an experiment result of a baseline model over a plurality of test images such as test images 610 and 630 for a shrunk contact-hole defect type.
  • the x-axis represents a defect index indicating the degree to which a certain image is determined to include a defect according to the baseline model
  • the y-axis represents the number of test images corresponding to a certain defect index.
  • First graph 710 shows a defect index distribution according to the baseline model for non-defective structures (e.g., first test image 610) and defective structures (e.g., third test image 630).
  • second graph 720 illustrates an experiment result of a contrastive CNN model over a plurality of test images such as test images 610 and 630 for a shrunk contact-hole defect type.
  • the x-axis represents a similarity index indicating the degree to which a certain image is determined to include a defect according to the contrastive CNN model
  • the y-axis represents the number of test images corresponding to a certain similarity index according to the contrastive CNN model.
  • Second graph 720 shows a similarity index distribution according to the contrastive CNN model for non-defective structures (e.g., first test image 610) and defective structures (e.g., third test image 630).
  • the contrastive CNN model shows a performance improvement greater than 30% in defect capture rate while maintaining a nuisance rate of 10% for a shrunk contact-hole defect type.
  • FIG. 7A and FIG. 7B also demonstrate that the contrastive CNN model shows a maximum accuracy improved by 6% compared to the baseline model. In these experiments, the maximum accuracy is measured assuming that an optimal threshold for defect determination is selected for each of the baseline model and the contrastive CNN model.
  • Further, as shown in FIG. 7A and FIG. 7B, the contrastive CNN model demonstrates a more distinct separation between defective structures and non-defective structures compared to the baseline model. Because there are considerable overlaps between the defective structures and non-defective structures in first graph 710, it will be noted that finding an optimal threshold for a defect determination can be challenging in the baseline model, which results in requiring more heuristic trial and error in setting the optimal threshold. Because second graph 720 has a more distinct separation between the defective structures and non-defective structures, it will be noted that the contrastive CNN model shows a more robust defect detection performance than the baseline model. Another result of experiments conducted over a contact-hole missing defect type using test images such as first test image 610 and second test image 620 also shows that the contrastive CNN model presents a more distinct separation between the defective structures and non-defective structures compared to the baseline model.
  • FIG. 8A illustrates test images 810, 820, and 830 with noise, which are used as target test images in these experiments.
  • Test images 810, 820, and 830 can also be obtained in a similar manner to the way of obtaining test images 610, 620, and 630 of FIG. 6A from synthetic base images 611, 621, and 631 of FIG. 6B.
  • first test image 810 contains a non-defective structure
  • second test image 820 contains a defective structure having a shrunk contact-hole defect
  • third test image 830 contains a defective structure having a missing contact-hole defect.
  • a contrastive CNN model, which can be utilized in defect detection system 300, and a conventional CNN model are trained using training data illustrated in FIG. 6C.
  • structures that are not included in the training data are used in test images 810, 820, and 830 to evaluate the generalizability of the contrastive CNN model and the conventional CNN model to structures the models have not learned.
  • FIG. 8B and FIG. 8C are graphs illustrating defect detection experiment results according to a conventional CNN model and a contrastive CNN model.
  • third graph 840 illustrates an experiment result of a conventional CNN model over a plurality of test images such as test images 810, 820, and 830.
  • the x-axis represents a defect index indicating the degree to which a certain image is determined to include a defect according to the conventional CNN model
  • the y-axis represents the number of test images corresponding to a certain defect index.
  • Third graph 840 shows a defect index distribution according to the conventional CNN model for non-defective structures (e.g., first test image 810) and defective structures (e.g., second and third test images 820 and 830).
  • fourth graph 850 illustrates an experiment result of a contrastive CNN model over a plurality of test images such as test images 810, 820, and 830.
  • the x-axis represents a similarity index indicating the degree to which a certain image is determined to include a defect according to the contrastive CNN model
  • the y-axis represents the number of test images corresponding to a certain similarity index according to the contrastive CNN model.
  • the contrastive CNN model demonstrates a more distinct separation between defective structures and non-defective structures compared to the conventional CNN model. Because there are considerable overlaps between the defective structures and non-defective structures in third graph 840, it will be noted that finding an optimal threshold for a defect determination can be challenging in the conventional CNN model, which results in unavoidable misclassification of the defective and non-defective structures. Because fourth graph 850 has a more distinct separation between the defective structures and non-defective structures, it will be noted that the contrastive CNN model shows a more robust defect detection performance than the conventional CNN model.
  • a contrastive deep learning methodology focuses more on extracting differences and similarities between a target image and one or more reference images using feature vectors to predict whether the target image contains a defect, rather than on learning specific defect structures as in the conventional CNN model.
  • the contrastive CNN model that is trained on a subset of structures can be generalized to detect defects on structures that are not included in the training data. Therefore, according to some embodiments of the present disclosure, defect detection generalization and robustness can be improved.
  • As shown in FIG. 8B and FIG. 8C, the contrastive CNN model demonstrates a better generalization compared to the conventional CNN model because the contrastive CNN model provides a better defect capture rate and accuracy when evaluating test images 810, 820, and 830, which are not learned in the training data of the contrastive CNN model and the conventional CNN model.
  • the contrastive CNN model that can be utilized in defect detection system 300 of the present disclosure generalizes better to structures that are not used during training while maintaining the same level of defect detection performance, compared to conventional methodologies including the conventional CNN methodology or the baseline methodology.
  • FIG. 9 is a process flowchart representing an example method for detecting a defect of an inspection image, consistent with embodiments of the present disclosure.
  • the steps of method 900 can be performed by a system (e.g., system 300 of FIGs. 3A-3B) executing on or otherwise using the features of a computing device, e.g., controller 109 of FIG. 1. It is appreciated that the illustrated method 900 can be altered to modify the order of steps and to include additional steps.
  • In step S910, first input data including a target image and second input data including a reference image are acquired.
  • Step S910 can be performed by, for example, input generator 3100, among others.
  • first input data can include a target image and a first reference image.
  • second input data can include the first reference image and a second reference image.
  • first input data can include a target image and second input data can include one or more reference images.
  • In step S920, a first feature vector for the first input data and a second feature vector for the second input data can be generated.
  • Step S920 can be performed by, for example, encoder 3200, among others.
  • the first feature vector and second feature vector can be generated via encoding by a contrastive neural network model.
  • the contrastive neural network model is trained to extract features from the first input data and the second input data, which enables classification between a defective structure and a non-defective structure. The training process and system are described in detail with reference to FIGs. 5A, 5B, and 5C.
  • the first feature vector and the second feature vector can be one-dimensional feature vectors.
  • In step S930, similarity between the first feature vector and the second feature vector is determined.
  • Step S930 can be performed by, for example, defect checker 3300, among others.
  • the similarity can be measured via a similarity metric, e.g., similarity metric (1).
  • In step S940, whether the target image contains a defect is determined based on the measured similarity between the first feature vector and the second feature vector.
  • Step S940 can be performed by, for example, defect checker 3300, among others.
  • If the similarity value is less than a certain threshold value, it can be determined that the target image contains a defect.
  • If the similarity value is greater than or equal to the threshold value, it can be determined that the target image does not contain a defect.
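  • Putting steps S910 through S940 together, the following is a hypothetical end-to-end sketch of method 900; the function signature and all names are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def detect_defect(encoder, first_input, second_input, threshold):
    """Sketch of method 900: inputs acquired in step S910 are encoded
    (S920), compared via cosine similarity (S930), and thresholded (S940)."""
    v1 = encoder(first_input)                   # step S920
    v2 = encoder(second_input)
    s = F.cosine_similarity(v1, v2, dim=-1)     # step S930, similarity metric (1)
    return bool(s < threshold)                  # step S940: True -> defect
```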
  • a non-transitory computer readable medium may be provided that stores instructions for a processor of a controller (e.g., controller 109 of FIG. 1) to carry out, among other things, image inspection, image acquisition, image processing, stage positioning, beam focusing, electric field adjustment, beam bending, condenser lens adjusting, activating charged-particle source, beam deflecting, and method 900.
  • non-transitory media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a Compact Disc Read Only Memory (CD-ROM), any other optical data storage medium, any physical medium with patterns of holes, a Random Access Memory (RAM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), a FLASH-EPROM or any other flash memory, Non-Volatile Random Access Memory (NVRAM), a cache, a register, any other memory chip or cartridge, and networked versions of the same.
  • a method for defect detection for a wafer inspection system comprising: acquiring a first input and a second input; generating a first feature vector for the first input and a second feature vector for the second input using a machine learning model; determining a similarity value between the first feature vector and the second feature vector; and determining whether the target inspection image contains a defect based on the similarity value.
  • acquiring the first input and the second input comprises: combining a target inspection image and a first reference image to generate the first input; and combining the first reference image and a second reference image to generate the second input.
  • the machine learning model is trained by performing operations including: acquiring a first training input and a second training input; generating a first training feature vector for the first training input and a second training feature vector for the second training input using a machine learning model; determining a training similarity value between the first training feature vector and the second training feature vector; and updating the machine learning model based on the training similarity value.
  • updating the machine learning model based on the training similarity value comprises: updating the machine learning model to increase the training similarity value between the first training feature vector and the second training feature vector when the target training image is labeled to contain a non-defective structure.
  • updating the machine learning model based on the training similarity value comprises: updating the machine learning model to decrease the training similarity value between the first training feature vector and the second training feature vector when the target training image is labeled to contain a defective structure.
  • the machine learning model is trained by performing operations including: acquiring a training reference image, a training non-defective image, and a training defective image; generating a first training feature vector for the training reference image, a second training feature vector for the training non-defective image, and a third training feature vector for the training defective image using a machine learning model; determining a first training similarity value between the first training feature vector and the second training feature vector and a second training similarity value between the first training feature vector and the third training feature vector; and updating the machine learning model based on the first and second training similarity values.
  • updating the machine learning model based on the first and second training similarity values comprises: updating the machine learning model to increase the first training similarity value and to decrease the second training similarity value.
  • a method of training a machine learning model for a wafer inspection system comprising: acquiring a first training input and a second training input; generating a first training feature vector for the first training input and a second training feature vector for the second training input using a machine learning model; determining a training similarity value between the first training feature vector and the second training feature vector; and updating the machine learning model based on the training similarity value.
  • updating the machine learning model based on the training similarity value comprises: updating the machine learning model to increase the training similarity value between the first training feature vector and the second training feature vector when the target training image is labeled to contain a non-defective structure.
  • updating the machine learning model based on the training similarity value comprises: updating the machine learning model to decrease the training similarity value between the first training feature vector and the second training feature vector when the target training image is labeled to contain a defective structure.
  • acquiring the first training input and the second training input comprises: combining the training target inspection image and a first training reference image to generate the first training input; and combining the first training reference image and a second training reference image to generate the second training input.
  • a method of training a machine learning model for a wafer inspection system comprising: acquiring a training reference image, a training non-defective image, and a training defective image; generating a first training feature vector for the training reference image, a second training feature vector for the training non-defective image, and a third training feature vector for the training defective image using a machine learning model; determining a first training similarity value between the first training feature vector and the second training feature vector and a second training similarity value between the first training feature vector and the third training feature vector; and updating the machine learning model based on the first and second training similarity values.
  • updating the machine learning model based on the first and second training similarity values comprises: updating the machine learning model to increase the first training similarity value and to decrease the second training similarity value.
  • An apparatus for defect detection for a wafer inspection system comprising: a memory storing a set of instructions; and at least one processor configured to execute the set of instructions to cause the apparatus to perform: acquiring a first input and a second input; generating a first feature vector for the first input and a second feature vector for the second input using a machine learning model; determining a similarity value between the first feature vector and the second feature vector; and determining whether the target inspection image contains a defect based on the similarity value.
  • the at least one processor in acquiring the first input and the second input, is configured to execute the set of instructions to cause the apparatus to further perform: combining a target inspection image and a first reference image to generate the first input; and combining the first reference image and a second reference image to generate the second input.
  • the machine learning model is trained by: acquiring a first training input and a second training input; generating a first training feature vector for the first training input and a second training feature vector for the second training input using a machine learning model; determining a training similarity value between the first training feature vector and the second training feature vector; and updating the machine learning model based on the training similarity value.
  • the machine learning model is trained by: acquiring a training reference image, a training non-defective image, and a training defective image; generating a first training feature vector for the training reference image, a second training feature vector for the training non-defective image, and a third training feature vector for the training defective image using a machine learning model; determining a first training similarity value between the first training feature vector and the second training feature vector and a second training similarity value between the first training feature vector and the third training feature vector; and updating the machine learning model based on the first and second training similarity values.
  • the machine learning model is trained by: updating the machine learning model to increase the first training similarity value and to decrease the second training similarity value.
  • An apparatus for training a machine learning model for a wafer inspection system comprising: a memory storing a set of instructions; and at least one processor configured to execute the set of instructions to cause the apparatus to perform: acquiring a first training input and a second training input; generating a first training feature vector for the first training input and a second training feature vector for the second training input using a machine learning model; determining a training similarity value between the first training feature vector and the second training feature vector; and updating the machine learning model based on the training similarity value.
  • the at least one processor in updating the machine learning model based on the training similarity value, is configured to execute the set of instructions to cause the apparatus to perform: updating the machine learning model to increase the training similarity value between the first training feature vector and the second training feature vector when the target training image is labeled to contain a non-defective structure.
  • the at least one processor in updating the machine learning model based on the training similarity value, is configured to execute the set of instructions to cause the apparatus to perform: updating the machine learning model to decrease the training similarity value between the first training feature vector and the second training feature vector when the target training image is labeled to contain a defective structure.
  • the at least one processor in acquiring the first training input and the second training input, is configured to execute the set of instructions to cause the apparatus to perform: combining the training target inspection image and a first training reference image to generate the first training input; and combining the first training reference image and a second training reference image to generate the second training input.
  • An apparatus for training a machine learning model for a wafer inspection system comprising: a memory storing a set of instructions; and at least one processor configured to execute the set of instructions to cause the apparatus to perform: acquiring a training reference image, a training non-defective image, and a training defective image; generating a first training feature vector for the training reference image, a second training feature vector for the training non-defective image, and a third training feature vector for the training defective image using a machine learning model; determining a first training similarity value between the first training feature vector and the second training feature vector and a second training similarity value between the first training feature vector and the third training feature vector; and updating the machine learning model based on the first and second training similarity values.
  • the at least one processor in updating the machine learning model based on the first and second training similarity values, is configured to execute the set of instructions to cause the apparatus to perform: updating the machine learning model to increase the first training similarity value and to decrease the second training similarity value.
  • a non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of a computing device to cause the computing device to perform a method for defect detection for a wafer inspection system, the method comprising: acquiring a first input and a second input; generating a first feature vector for the first input and a second feature vector for the second input using a machine learning model; determining a similarity value between the first feature vector and the second feature vector; and determining whether the target inspection image contains a defect based on the similarity value.
  • the machine learning model is trained by: updating the machine learning model to increase the training similarity value between the first training feature vector and the second training feature vector when the target training image is labeled to contain a non-defective structure.
  • the machine learning model is trained by: updating the machine learning model to decrease the training similarity value between the first training feature vector and the second training feature vector when the target training image is labeled to contain a defective structure.
  • the non-transitory computer readable medium of any one of clauses 47-52, wherein the machine learning model is trained by: acquiring a training reference image, a training non-defective image, and a training defective image; generating a first training feature vector for the training reference image, a second training feature vector for the training non-defective image, and a third training feature vector for the training defective image using a machine learning model; determining a first training similarity value between the first training feature vector and the second training feature vector and a second training similarity value between the first training feature vector and the third training feature vector; and updating the machine learning model based on the first and second training similarity values.
  • a non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of a computing device to cause the computing device to perform a method of training a machine learning model for a wafer inspection system, the method comprising: acquiring a first training input and a second training input; generating a first training feature vector for the first training input and a second training feature vector for the second training input using a machine learning model; determining a training similarity value between the first training feature vector and the second training feature vector; and updating the machine learning model based on the training similarity value.
  • a non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of a computing device to cause the computing device to perform a method of training a machine learning model for a wafer inspection system, the method comprising: acquiring a training reference image, a training non-defective image, and a training defective image; generating a first training feature vector for the training reference image, a second training feature vector for the training non-defective image, and a third training feature vector for the training defective image using a machine learning model; determining a first training similarity value between the first training feature vector and the second training feature vector and a second training similarity value between the first training feature vector and the third training feature vector; and updating the machine learning model based on the first and second training similarity values.
  • Block diagrams in the figures may illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer hardware or software products according to various exemplary embodiments of the present disclosure.
  • each block in a schematic diagram may represent certain arithmetical or logical operation processing that may be implemented using hardware such as an electronic circuit.
  • Blocks may also represent a module, segment, or portion of code that comprises one or more executable instructions for implementing the specified logical functions. It should be understood that in some alternative implementations, functions indicated in a block may occur out of the order noted in the figures. For example, two blocks shown in succession may be executed or implemented substantially concurrently, or two blocks may sometimes be executed in reverse order, depending upon the functionality involved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Testing Or Measuring Of Semiconductors Or The Like (AREA)
  • Analysing Materials By The Use Of Radiation (AREA)

Abstract

The invention discloses improved systems (300) and methods of defect detection for a wafer inspection system. An improved method comprises acquiring a first input (3001) and a second input (3003), generating a first feature vector (3211) for the first input (3001) and a second feature vector (3222) for the second input (3003) using a machine learning model (3200), determining a similarity value between the first feature vector (3211) and the second feature vector (3222), and determining whether the target inspection image contains a defect based on the similarity value.
PCT/EP2024/080483 2023-11-21 2024-10-28 Apprentissage profond contrastif pour inspection de défauts Pending WO2025108661A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP23211337.3 2023-11-21
EP23211337 2023-11-21
EP24192475.2 2024-08-01
EP24192475 2024-08-01

Publications (1)

Publication Number Publication Date
WO2025108661A1 true WO2025108661A1 (fr) 2025-05-30

Family

ID=93258956

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2024/080483 Pending WO2025108661A1 (fr) 2023-11-21 2024-10-28 Apprentissage profond contrastif pour inspection de défauts

Country Status (1)

Country Link
WO (1) WO2025108661A1 (fr)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200327654A1 (en) * 2019-04-09 2020-10-15 Kla Corporation Learnable defect detection for semiconductor applications
CN115769255A (zh) * 2020-08-19 2023-03-07 科磊股份有限公司 扫描电子显微镜图像锚定阵列的设计

Similar Documents

Publication Publication Date Title
US20250006456A1 (en) Cross-talk cancellation in multiple charged-particle beam inspection
US20250104210A1 (en) Method and system of defect detection for inspection sample based on machine learning model
US20240212131A1 (en) Improved charged particle image inspection
TWI874250B (zh) 用於產生合成缺陷影像的設備及相關的非暫時性電腦可讀媒體
US20250200719A1 (en) Method and system for reducing charging artifact in inspection image
US20240331115A1 (en) Image distortion correction in charged particle inspection
WO2025108661A1 (fr) Apprentissage profond contrastif pour inspection de défauts
US20250095116A1 (en) Image enhancement in charged particle inspection
TW202541094A (zh) 用於缺陷檢測之對比深度學習
WO2025201837A1 (fr) Amélioration de métrologie pour mesures basées sur un bord de motif
WO2025131571A1 (fr) Amélioration de précision de métrologie à l'aide d'un paramètre géométrique de référence
EP4607453A2 (fr) Élimination de charge de mesure et inférence de géométrie d'empilement par couche à l'aide de modèles de diffusion
EP4607460A2 (fr) Identification et segmentation de défauts sans données étiquetées et amélioration de qualité d'image
WO2025056262A1 (fr) Réglage de recette en temps réel pour système d'inspection
WO2025209815A1 (fr) Alignement local à la volée
TW202542643A (zh) 使用參考幾何參數之度量衡精度改進
WO2025036991A1 (fr) Systèmes et procédés de génération de plan d'échantillonnage hybride et de projection précise de perte de puce
TW202541091A (zh) 使用3d堆疊帶電粒子束檢測影像之中軸之良率關鍵缺陷偵測
WO2024213339A1 (fr) Procédé de génération de plan d'échantillonnage dynamique efficace et projection précise de perte de puce de sonde
WO2025087668A1 (fr) Systèmes et procédés de prédiction de mode de charge et de paramètres optimaux pour contraste de tension
TW202541077A (zh) 用於預測電壓對比之充電模式及最佳參數之系統及方法
WO2025242396A1 (fr) Sélection de niveau de tranche pour inspection en ligne améliorée dans la fabrication de semi-conducteurs
WO2025103678A1 (fr) Mesure d'erreur de placement de bord directe à l'aide d'un microscope à particules chargées à balayage haute tension et apprentissage automatique
TW202544949A (zh) 用於基於圖案邊緣之量測之度量衡改進
WO2024099710A1 (fr) Création de carte de probabilité de défaut dense destinée à être utilisée dans un modèle d'apprentissage machine pour inspection informatiquement guidée

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24795220

Country of ref document: EP

Kind code of ref document: A1