Raw Data File Upload
For m data items and n measures, the file should contain m + 1 rows (one for each data item plus a row of measure labels) and n + 1 columns (one for each measure plus a column of vertex labels). Each data item is represented as a vertex in the resulting graph. If "correlate rows to each other based on column values" is chosen, each pair of vertices is connected by an edge whose weight is proportional to the correlation of their data vectors. If "connect nodes (one per row) to each other based on Euclidean distance" is chosen, each pair of vertices is connected by an edge whose weight decays with the Euclidean distance between their data vectors: weight = e^(-distance^2/t). If the vertices are too "clumped" in a corner of the Laplacian embedding, changing the parameter t may help; t sets the distance scale, so a larger t gives distant pairs relatively more weight. An example file is represented below (the horizontal tab in the first row is required):
If measures are correlated and/or redundant, principal component analysis (PCA) may first be performed on the data, which normalizes and de-correlates the data before calculating edge weights.
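The two edge-weighting options and the optional PCA step can be illustrated with a minimal NumPy sketch. This is not the tool's own code. The function names (correlation_weights, gaussian_weights, pca_whiten) are hypothetical, the data are assumed to be already loaded as an m x n numeric array with the labels stripped, and the PCA step is assumed to mean projecting onto the principal components and scaling each to unit variance. How the tool treats negative correlations is not stated above, so the sketch leaves them unchanged.

```python
import numpy as np

def correlation_weights(data):
    """Edge weights from the Pearson correlation between rows (data items)."""
    w = np.corrcoef(data)        # m x m matrix; each row is one data vector
    np.fill_diagonal(w, 0.0)     # assumption: no self-loops
    return w

def gaussian_weights(data, t=1.0):
    """Edge weights weight = e^(-distance^2/t) from Euclidean distance."""
    diff = data[:, None, :] - data[None, :, :]   # all pairwise differences
    sq_dist = np.sum(diff ** 2, axis=-1)         # squared Euclidean distances
    w = np.exp(-sq_dist / t)
    np.fill_diagonal(w, 0.0)     # assumption: no self-loops
    return w

def pca_whiten(data):
    """Assumed PCA step: de-correlate the measures and scale to unit variance."""
    centered = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    keep = s > 1e-12 * s.max()   # drop directions with (near-)zero variance
    return u[:, keep] * np.sqrt(len(data) - 1)

# Illustrative data only: 4 data items (rows) measured on 3 measures (columns).
data = np.array([[1.0, 2.0, 3.0],
                 [2.0, 4.0, 6.5],
                 [3.0, 1.0, 2.0],
                 [0.0, 5.0, 1.0]])

w_corr = correlation_weights(data)
w_dist = gaussian_weights(pca_whiten(data), t=2.0)
```

In this sketch, increasing t moves every off-diagonal Gaussian weight closer to 1, which is one way to see why adjusting t can spread out vertices that are clumped together in the embedding.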