Our ECE Florida Spring Seminar Series kicks off with a talk by Dr. Alina Zare. Please join us to learn how machine learning can overcome inaccurate training data.

Title: Multiple Instance Learning Approaches for Target Characterization

Thursday, Jan. 7 at 1pm in Larsen Hall, Rm 234


Most supervised machine learning algorithms assume that each training data point is paired with an accurate training label (for classification) or value (for regression). However, obtaining accurate training label information is often time-consuming and expensive, making it infeasible for large data sets, or may simply be impossible given the physics of the problem. Furthermore, human annotators may be inconsistent when labeling a data set, providing inherently imprecise label information. As a result, in many applications one has access only to inaccurately labeled training data. For example, consider single-pixel or sub-pixel target detection within remotely sensed imagery: often only GPS coordinates for targets of interest are available, with an accuracy ranging across several pixels. Thus, the specific pixels that correspond to the target are unknown (even with the GPS ground-truth information). Training an accurate classifier or learning a representative target signature from this sort of uncertainly labeled training data is extremely difficult in practice. In this example, accurately labeled training data is unavailable, and an approach that can learn from uncertain training labels, such as Multiple Instance Learning (MIL) methods, is required. Once we learn to spot it, we find that this challenge of needing to learn from weakly labeled data or uncertain training labels plagues many potential machine learning and pattern recognition applications.


MIL is a variation on supervised learning for problems with imprecise label information. In particular, training data is segmented into positively and negatively labeled bags. In the case of target characterization, the multiple instance learning problem requires that a positive bag contain at least one instance from the target class and that negatively labeled bags be composed entirely of non-target data. Given training data of this form, the overall goal can be to predict either unknown instance-level or unknown bag-level labels on test data. MIL methods are effective for developing classifiers in cases where accurate instance-level labeled training data is unavailable. Since the introduction of the MIL framework, many methods have been proposed and developed in the literature. The majority of MIL approaches focus on learning a classification decision boundary to distinguish between positive and negative instances/bags from the ambiguously labeled data. Although these approaches are effective at training classifiers given imprecise labels, they generally do not provide an intuitive description or representative target concept that characterizes the salient and discriminative features of the target class. The Functions of Multiple Instances (FUMI) approach is one of the few MIL methods that can estimate a representative target concept. In addition, the FUMI methods can address the case of target signature variability. This presentation will introduce the FUMI approach, describe several FUMI-based algorithms, and present results on a variety of data types and applications.
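
The bag-labeling rule described above can be illustrated with a minimal Python sketch. It is not taken from the talk and is not the FUMI algorithm: the function names `bag_label` and `predict_bag`, the per-instance scores, and the 0.5 threshold are hypothetical choices made only for illustration. It shows the standard MIL assumption that a bag is positive when at least one of its instances is a target, and a common bag-level prediction that scores a bag by its highest-scoring instance.

```python
from typing import Sequence


def bag_label(instance_is_target: Sequence[bool]) -> int:
    """Standard MIL assumption: a bag is positive (1) if at least one
    instance is a target, and negative (0) only if no instance is."""
    return int(any(instance_is_target))


def predict_bag(instance_scores: Sequence[float], threshold: float = 0.5) -> int:
    """Hypothetical bag-level prediction: call the bag positive when its
    highest instance score reaches the threshold. The scores would come
    from some instance-level model, which is not shown here."""
    best = max(instance_scores, default=float("-inf"))  # empty bag -> negative
    return int(best >= threshold)


# A positive bag, e.g. the pixels near an imprecise GPS coordinate:
# only one pixel is actually the target, and we do not know which one.
positive_bag = [False, False, True, False]
# A negative bag contains only non-target pixels.
negative_bag = [False, False, False]

print(bag_label(positive_bag), bag_label(negative_bag))
print(predict_bag([0.1, 0.2, 0.9, 0.3]), predict_bag([0.1, 0.2, 0.3]))
```

Taking the maximum over instance scores mirrors the "at least one target instance" rule; other pooling choices exist, and the right one depends on the application.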


Bio: Alina Zare (PhD, University of Florida) conducts research and teaches in the area of pattern recognition and machine learning in the Electrical and Computer Engineering Department at the University of Missouri. Dr. Zare’s research interests include automated analysis of large data sets from a variety of sensors, including multi- and hyperspectral imagery, synthetic aperture sonar (SAS), LIDAR, wideband electromagnetic induction (WEMI) data, and ground penetrating radar. Dr. Zare’s current research includes applications in landmine and explosive hazard detection, sub-pixel target characterization and detection, underwater scene understanding, and plant root imaging and analysis. Dr. Zare is a recipient of the prestigious National Science Foundation CAREER award for her research on “Supervised Learning with Incomplete and Uncertain Data” as well as the National Geospatial-Intelligence Agency’s New Investigator Program award for her research in “Functions of Multiple Instances for Hyperspectral Analysis.”