# Dope Tools

# Terms

  • Entity Resolution: Identifying records that refer to the same real-world entity, e.g. consolidating the same product sold by different sellers under a single product page. Similarity comparisons are very useful for this.
  • Co-occurrence Grouping: Also known as frequent itemset mining, association rule discovery, or market basket analysis
    • Finding associations between entities based on transactions that involve them
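
As a rough illustration of co-occurrence grouping, the sketch below counts how many transactions contain each pair of items and keeps the pairs that appear at least `min_support` times. The function name `pair_support`, the threshold, and the sample baskets are all made up for this example; real frequent itemset miners (e.g. Apriori) also handle larger itemsets and prune far more cleverly.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions, min_support=2):
    """Count the transactions containing each unordered item pair.

    Returns only the pairs seen in at least `min_support` transactions.
    """
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so every pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical transactions, invented for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

frequent_pairs = pair_support(transactions)
for pair, n in sorted(frequent_pairs.items()):
    print(pair, n)
```

Pairs that clear the threshold are candidates for association rules such as "customers who buy X also buy Y".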

# Similarity functions

  • Euclidean Distance
  • Manhattan Distance
  • Jaccard Similarity
    • Overlap between two sets, e.g. the neighbor sets of two nodes.
    • Jaccard similarity of sets S and T is |S ∩ T| / |S ∪ T|
    • Value of 1 means complete overlap, 0 means no overlap
  • String edit distance
    • Measures the minimum number of textual transformations (e.g. character insertions, deletions, substitutions) needed to turn one string into another
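
The four measures above can be sketched in plain Python as follows. This is a minimal sketch, not a reference implementation: the function names are chosen for this example, the edit distance shown is the Levenshtein variant (one common choice among several edit distances), and treating two empty sets as having Jaccard similarity 1 is a convention assumed here.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def jaccard(s, t):
    """|S ∩ T| / |S ∪ T| for two collections treated as sets."""
    s, t = set(s), set(t)
    if not s and not t:
        return 1.0  # assumed convention: two empty sets overlap completely
    return len(s & t) / len(s | t)


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(euclidean((0, 0), (3, 4)))
print(manhattan((0, 0), (3, 4)))
print(jaccard({1, 2, 3}, {2, 3, 4}))
print(edit_distance("kitten", "sitting"))
```

For entity resolution, a low edit distance between two product titles or a high Jaccard similarity between their token sets can flag listings that likely describe the same product.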

# Visualization Techniques

# Pre-attentively processed features

pre-attentive-processing.png

# Gestalt Psychology

Gestalt psychology describes 8 principles by which we perceive groupings in the real world:

  • Proximity
  • Similarity
  • Closure
  • Symmetry
  • Common Fate
  • Continuity
  • Good Gestalt
  • Past Experience

# Color schemes

color-schemes.png

A useful site for picking color schemes is ColorBrewer.