meter data are similar to one another are grouped into one cluster and are determined to have the same phase connection.
• Supervised ML: In this approach, the phase connections of a small number of customers are known (i.e., "labels"). Using these labels, supervised ML algorithms learn the relationship between smart meter data and phase connections. The learned relationship is then used to determine the phase connections of unlabeled smart meters.
• Physics-informed ML: In this approach, a precise model is developed from the physical model of three-phase power flow. Given the phase connections and power measurement data, the model calculates the smart meter voltages. The phase connections are then determined by minimizing the error between the calculated voltages and the measured values.

Table 1 summarizes the three groups of data-driven phase identification technologies. Utilities can select a technology based on their data availability and accuracy requirements.

Figure 1. A distribution system model with BTM resources. [Diagram labels: Substation; Feeder; phases A, B, C; Transformers 1-3; Customers 1-7.]

TABLE 1. A summary of data-driven phase identification technologies.
| | Unsupervised ML | Supervised ML | Physics-Informed ML |
| --- | --- | --- | --- |
| Needed data and information | Smart meter (voltage magnitude and power); SCADA data | Smart meter (voltage magnitude); SCADA data; samples of correct phase labels for transformers/meters | Smart meter (voltage magnitude and power); SCADA data; physical primary feeder model and locations of smart meters |
| Advantages | Accurate phase identification results; minimum data requirement | Higher accuracy than unsupervised ML; does not require physical primary feeder model | Highest phase identification accuracy; excellent interpretability |
| Disadvantages | Accuracy is not as high as the other two methods; requires physical primary feeder model | Less accurate than physics-informed ML; requires samples of correct phase labels | Requires more network information than the other two methods |

22 IEEE Electrification Magazine / DECEMBER 2022
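To make the three groups of technologies more concrete, the following is a minimal Python sketch on synthetic data. Everything in it is an assumption made for illustration: the function names, the toy data generator, the use of k-means and a nearest-centroid rule as stand-ins for unsupervised and supervised ML algorithms, and above all the single-coefficient linear voltage-drop model `r`, which replaces a real three-phase power flow model. It is not the method of any particular study or utility.

```python
import numpy as np

def make_synthetic_data(n_customers=30, n_steps=200, seed=0):
    """Toy data: a meter voltage follows its phase's feeder-head voltage
    minus a load-dependent drop, plus measurement noise (all hypothetical)."""
    rng = np.random.default_rng(seed)
    v_head = 1.0 + 0.01 * rng.standard_normal((3, n_steps))  # per-phase p.u., SCADA-like
    phase = np.arange(n_customers) % 3                       # true phase index (0=A, 1=B, 2=C)
    power = rng.uniform(0.0, 5.0, (n_customers, n_steps))    # customer load, kW
    r = 0.002                                                # assumed p.u. drop per kW
    noise = 0.001 * rng.standard_normal((n_customers, n_steps))
    v_meter = v_head[phase] - r * power + noise
    return v_head, power, v_meter, phase, r

def standardize(v):
    """Zero-mean, unit-norm voltage profile for each meter."""
    z = v - v.mean(axis=1, keepdims=True)
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def cluster_by_voltage(v_meter, k=3, n_iter=20):
    """Unsupervised: k-means on standardized profiles, farthest-point init."""
    z = standardize(v_meter)
    centers = [z[0]]
    for _ in range(k - 1):
        # Distance from every meter to its nearest already-chosen center.
        d = np.min([((z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        centers = np.array([z[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

def classify_with_labels(v_meter, labeled_idx, labeled_phase):
    """Supervised: nearest-centroid rule learned from a few labeled meters."""
    z = standardize(v_meter)
    labeled_idx = np.asarray(labeled_idx)
    labeled_phase = np.asarray(labeled_phase)
    centroids = np.array([z[labeled_idx[labeled_phase == p]].mean(axis=0)
                          for p in range(3)])
    # Assign each meter to the phase whose centroid it correlates with most.
    return (z @ centroids.T).argmax(axis=1)

def identify_by_physics(v_meter, power, v_head, r):
    """Physics-informed (heavily simplified): for each candidate phase,
    calculate the meter voltage and keep the phase with the smallest error."""
    v_calc = v_head[None, :, :] - r * power[:, None, :]      # (customer, phase, time)
    err = ((v_meter[:, None, :] - v_calc) ** 2).sum(axis=2)
    return err.argmin(axis=1)

v_head, power, v_meter, phase, r = make_synthetic_data()
clusters = cluster_by_voltage(v_meter)
pred_sup = classify_with_labels(v_meter, np.arange(6), phase[:6])
pred_phy = identify_by_physics(v_meter, power, v_head, r)
print("supervised accuracy:", (pred_sup == phase).mean())
print("physics-informed accuracy:", (pred_phy == phase).mean())
```

The cluster IDs returned by `cluster_by_voltage` are arbitrary, so mapping them to phases A, B, and C still needs some external reference. The sketch also mirrors the data requirements in Table 1: the supervised function needs a few correct labels, and the physics-based function needs power data and a network model (here reduced to `r`).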