comparison python/ml.py @ 526:21bdeb29f855
corrected bug in initialization of lists and loading trajectories from vissim files
| author | Nicolas Saunier <nicolas.saunier@polymtl.ca> |
|---|---|
| date | Fri, 20 Jun 2014 17:45:32 -0400 |
| parents | 727e3c529519 |
| children | 39de5c532559 |
| 525:7124c7d2a663 | 526:21bdeb29f855 |
|---|---|
| 56 def kMedoids(similarityMatrix, initialCentroids = None, k = None): | 56 def kMedoids(similarityMatrix, initialCentroids = None, k = None): |
| 57 '''Algorithm that clusters any dataset based on a similarity matrix | 57 '''Algorithm that clusters any dataset based on a similarity matrix |
| 58 Either the initialCentroids or k are passed''' | 58 Either the initialCentroids or k are passed''' |
| 59 pass | 59 pass |
| 60 | 60 |
| 61 def assignCluster(data, similarFunc, initialCentroids = [], shuffleData = True): | 61 def assignCluster(data, similarFunc, initialCentroids = None, shuffleData = True): |
| 62 '''k-means algorithm with similarity function | 62 '''k-means algorithm with similarity function |
| 63 Two instances should be in the same cluster if the sameCluster function returns true for two instances. It is supposed that the average centroid of a set of instances can be computed, using the function. | 63 Two instances should be in the same cluster if the sameCluster function returns true for two instances. It is supposed that the average centroid of a set of instances can be computed, using the function. |
| 64 The number of clusters will be determined accordingly | 64 The number of clusters will be determined accordingly |
| 65 | 65 |
| 66 data: list of instances | 66 data: list of instances |
| 69 from random import shuffle | 69 from random import shuffle |
| 70 from copy import copy, deepcopy | 70 from copy import copy, deepcopy |
| 71 localdata = copy(data) # shallow copy to avoid modifying data | 71 localdata = copy(data) # shallow copy to avoid modifying data |
| 72 if shuffleData: | 72 if shuffleData: |
| 73 shuffle(localdata) | 73 shuffle(localdata) |
| 74 if initialCentroids: | 74 if initialCentroids == None: |
| | 75 centroids = [Centroid(localdata[0])] |
| | 76 else: |
| 75 centroids = deepcopy(initialCentroids) | 77 centroids = deepcopy(initialCentroids) |
| 76 else: | |
| 77 centroids = [Centroid(localdata[0])] | |
| 78 for instance in localdata[1:]: | 78 for instance in localdata[1:]: |
| 79 i = 0 | 79 i = 0 |
| 80 while i<len(centroids) and not similarFunc(centroids[i].instance, instance): | 80 while i<len(centroids) and not similarFunc(centroids[i].instance, instance): |
| 81 i += 1 | 81 i += 1 |
| 82 if i == len(centroids): | 82 if i == len(centroids): |

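The signature change on line 61 replaces a mutable default (`initialCentroids = []`) with `None`. The sketch below illustrates, in general terms, why a `None` sentinel is usually the safer default in Python. It is a standalone example and not code from `ml.py`; the names `append_shared` and `append_fresh` are hypothetical.

```python
def append_shared(item, bucket=[]):
    # Hypothetical helper: the default list is created once, when the
    # function is defined, so every call that omits `bucket` reuses it.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Hypothetical helper: None acts as a sentinel, and a new list is
    # built on each call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_shared = append_shared(1)
second_shared = append_shared(2)   # same list object as first_shared

first_fresh = append_fresh(1)
second_fresh = append_fresh(2)     # independent list
```

The revision shown tests `initialCentroids == None`; for plain lists this gives the same result as the identity check `is None` used above, which is the form PEP 8 recommends. One consequence of the sentinel is that an explicitly passed empty list is no longer treated like a missing argument.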