CHAPTER 19 Discovering and Tracking Patterns of Interest in Security Sensor Streams

Equation 19-5

$$\Gamma_i(a_i^j) = \frac{1}{|a_i^j|} \sum_{k=1}^{|a_i^j|} \Gamma_e(k)$$

The continuity of a variation, $\Gamma_v$, is then defined as the average continuity of its instances. $\Gamma_v(a_i)$ is defined as in Eq. (19-6), where $n_{a_i}$ shows the total number of instances for variation $a_i$.

Equation 19-6

$$\Gamma_v(a_i) = \frac{1}{n_{a_i}} \sum_{j=1}^{n_{a_i}} \Gamma_i(a_i^j)$$

The continuity, $\Gamma_g$, of a general pattern, $g$, is defined as the weighted average continuity of its variations. $\Gamma_g$ is defined according to Eq. (19-7), where the continuity for each $a_i$ is weighted by its frequency $f_{a_i}$ and $n_a$ shows the total number of variations for general pattern $a$.

19.4.2. Clustering Sequences into Groups of Activities

The second step of the ADM algorithm is to identify pattern clusters that will represent the set of discovered activities and their instances. Specifically, ADM groups the set of discovered patterns, $P$, into a set of clusters, $A$. The resulting set of clusters represents the activities that we will model, recognize, and track. Though ADM uses a standard k-means clustering method [43], we still need to define a method for determining cluster centroids and for comparing activities in order to form clusters.

A number of methods have been reported in the literature for sequence clustering, such as the CLUSEQ algorithm by Yang and Wang [44] and the ROCK algorithm by Noh et al. [45]. The difference between their approach and ours is that they consider purely symbolic sequences with no features attached to
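The three continuity levels in Eqs. (19-5) through (19-7) nest inside one another, which a short Python sketch can make concrete. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the per-event continuity values are taken as given inputs because their definition falls outside this excerpt, and the weighted average for a general pattern is assumed to be normalized by the total frequency of its variations, since the excerpt describes Eq. (19-7) without showing it.

```python
from statistics import fmean


def instance_continuity(event_continuities):
    """Eq. (19-5): average of the per-event continuity values of one instance.

    `event_continuities` is a non-empty list of numbers, one per event k.
    How each value is obtained is not covered here (assumed given).
    """
    if not event_continuities:
        raise ValueError("an instance needs at least one event")
    return fmean(event_continuities)


def variation_continuity(instances):
    """Eq. (19-6): average continuity over all instances of one variation."""
    if not instances:
        raise ValueError("a variation needs at least one instance")
    return fmean([instance_continuity(inst) for inst in instances])


def general_pattern_continuity(variations):
    """Frequency-weighted average continuity of a general pattern's variations.

    `variations` is a list of (frequency, instances) pairs. Dividing by the
    total frequency is an assumption made for this sketch.
    """
    total_frequency = sum(freq for freq, _ in variations)
    if total_frequency <= 0:
        raise ValueError("total frequency must be positive")
    weighted_sum = sum(freq * variation_continuity(insts)
                       for freq, insts in variations)
    return weighted_sum / total_frequency


if __name__ == "__main__":
    # Hypothetical data: two variations of one general pattern.
    pattern = [
        (3, [[1.0, 0.5], [0.5, 0.5]]),  # frequency 3, two instances
        (1, [[1.0, 1.0, 1.0]]),         # frequency 1, one instance
    ]
    print(general_pattern_continuity(pattern))  # prints 0.71875
```

In the example, the first variation has instance continuities 0.75 and 0.5, giving a variation continuity of 0.625; the second has 1.0. Weighting by frequencies 3 and 1 gives (3 × 0.625 + 1 × 1.0) / 4 = 0.71875.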