agate
1.7.1

Search

Exact search

Find all individuals with the last_name “Groskopf”:

family = table.where(lambda r: r['last_name'] == 'Groskopf')
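Real data often varies in case or surrounding whitespace, and a text cell can be None. The following is a minimal sketch of a reusable predicate, assuming those variations should still count as an exact match. exact_match is a hypothetical helper, not part of agate, and plain dicts stand in for agate rows so the example runs on its own:

```python
def exact_match(column, value):
    """Build a row predicate that compares one column to a value,
    ignoring case and surrounding whitespace (hypothetical helper)."""
    wanted = value.strip().casefold()

    def predicate(row):
        cell = row[column]
        # Null cells never match.
        if cell is None:
            return False
        return cell.strip().casefold() == wanted

    return predicate


# Plain dicts stand in for agate rows, which support the same row['name'] lookup.
rows = [
    {'last_name': 'Groskopf'},
    {'last_name': ' groskopf '},
    {'last_name': None},
    {'last_name': 'Grosskopf'},
]
is_groskopf = exact_match('last_name', 'Groskopf')
matches = [r for r in rows if is_groskopf(r)]
```

With a real table, the same predicate would be passed directly: table.where(exact_match('last_name', 'Groskopf')).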

Fuzzy search by edit distance

An existing Python library for computing the Levenshtein edit distance makes a fuzzy string search easy to implement.

For example, to find all names within 2 edits of “Groskopf”:

from Levenshtein import distance

fuzzy_family = table.where(lambda r: distance(r['last_name'], 'Groskopf') <= 2)

These results will now include all those “Grosskopfs” and “Groskoffs” whose mail I am always getting.
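To make the threshold concrete, here is a sketch of what an edit distance measures: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. This pure-Python version is for illustration only and is not the library's implementation; edit_distance and within_edits are hypothetical names. The predicate also skips None cells, which would otherwise raise an error when compared:

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def within_edits(column, target, max_edits=2):
    """Build a row predicate that tolerates None cells (hypothetical helper)."""
    def predicate(row):
        cell = row[column]
        return cell is not None and edit_distance(cell, target) <= max_edits
    return predicate


near_groskopf = within_edits('last_name', 'Groskopf')
```

The comparison is case-sensitive, so "groskopf" counts as one edit away from "Groskopf". Lowercase both sides first if case should not matter.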

Fuzzy search by phonetic similarity

The Fuzzy library computes phonetic encodings of strings, which makes it possible to implement a fuzzy phonetic search.

For example, to find all rows with first_name phonetically similar to “Catherine”:

import fuzzy

dmetaphone = fuzzy.DMetaphone(4)
phonetic_search = dmetaphone('Catherine')

def phonetic_match(r):
    return any(x in dmetaphone(r['first_name']) for x in phonetic_search)

phonetic_family = table.where(phonetic_match)
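One caveat with matching on code lists: as I understand Fuzzy's documented behavior, DMetaphone returns a primary and a secondary code, and the secondary is None when a name has only one encoding. If the search term and a row both carry a None, a plain membership test would treat them as a match. The sketch below, under that assumption, drops None codes and skips empty cells. phonetic_matcher is a hypothetical helper, and toy_encode is a stand-in encoder used only so the example runs without Fuzzy installed:

```python
def phonetic_matcher(encode, column, target):
    """Build a row predicate from any encoder that returns a sequence of
    phonetic codes, some of which may be None (hypothetical helper)."""
    # Keep only real codes so that None never counts as a shared code.
    target_codes = {code for code in encode(target) if code is not None}

    def predicate(row):
        cell = row[column]
        # Null cells cannot be encoded, so they never match.
        if cell is None:
            return False
        return any(code in target_codes for code in encode(cell))

    return predicate


def toy_encode(name):
    """Toy encoder for demonstration only. It is NOT Double Metaphone:
    it maps a leading C to K and always reports no secondary code."""
    primary = name[:1].upper().replace('C', 'K')
    return [primary, None]


sounds_like_catherine = phonetic_matcher(toy_encode, 'first_name', 'Catherine')
```

With Fuzzy installed, the real encoder would replace the toy one: table.where(phonetic_matcher(fuzzy.DMetaphone(4), 'first_name', 'Catherine')).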

© Copyright 2023, Christopher Groskopf.
