I. Introduction
In modern VLSI design for manufacturability (DFM), measuring the similarity among layout designs is crucial and underlies almost all applications in the field [1], [2]. Capturing and representing the intrinsic characteristics of a layout, such as its topological information, is the key to addressing this problem. Because a pattern intuitively describes and summarizes the two-dimensional polygon configurations in a layout, pattern-based schemes are widely used in layout design. For example, design rule check (DRC) Plus, which exploits a library of patterns to identify problematic 2D configurations, has proven effective [3]. However, as integrated circuit feature sizes continue to shrink, patterning technology may suffer from poor process margin [4]. In addition, the number of patterns increases dramatically, which makes it challenging to identify and organize the patterns and to carry forward the learning on each pattern from test layout designs to mature products. Meanwhile, machine learning techniques have recently been widely adopted in DFM. For a machine learning model, the input features directly affect the performance of regression and prediction. Therefore, the problem of how to properly extract characteristics from numerous patterns urgently demands a solution. In this paper, we use two applications in the computational lithography domain as examples to examine feature extraction of layouts in depth.