Today I came across a brilliant Python library called leopards, which lets you run some basic but frequently used filters and aggregations on Python lists. I've always used pandas for quick, ad hoc work with large lists or CSV files, but leopards looks like a quick and performant alternative when you don't need to do any fancy data analysis.

Leopards provides the following filters:

  • eq: equals; this is the default filter.
  • gt: greater than.
  • gte: greater than or equal.
  • lt: less than.
  • lte: less than or equal.
  • in: the value is in a list or a tuple.
    • e.g. age__in=[10,20,30]
  • contains: contains a substring as in the example.
  • icontains: case-insensitive contains.
  • startswith: checks if a value starts with a query string.
  • istartswith: case-insensitive startswith.
  • endswith: checks if a value ends with a query string.
  • iendswith: case-insensitive endswith.
  • isnull: checks if the value matches any of NULL_VALUES which are ("", ".", None, "None", "null", "NULL")
    • e.g. filter__isnull=True or filter__isnull=False
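
To make the lookup syntax concrete, here is a minimal pure-Python sketch of how a key such as age__in could be split into a field name and an operator. This is not how leopards is actually implemented; the helper names split_lookup and matches are made up for illustration, and only a handful of the operators above are covered.

```python
# Illustrative sketch only: NOT the leopards implementation.
# split_lookup and matches are hypothetical helpers.
NULL_VALUES = ("", ".", None, "None", "null", "NULL")  # as listed above


def split_lookup(key):
    """Split 'age__gte' into ('age', 'gte'); a bare 'age' defaults to 'eq'."""
    field, sep, op = key.partition("__")
    return field, (op if sep else "eq")


def matches(row, key, expected):
    """Check one row (a dict) against a single lookup."""
    field, op = split_lookup(key)
    value = row.get(field)
    if op == "eq":
        return value == expected
    if op == "in":
        return value in expected
    if op == "contains":
        return expected in str(value)
    if op == "icontains":
        return expected.lower() in str(value).lower()
    if op == "startswith":
        return str(value).startswith(expected)
    if op == "endswith":
        return str(value).endswith(expected)
    if op == "isnull":
        return (value in NULL_VALUES) == expected
    raise ValueError(f"unsupported lookup: {op}")


rows = [{"name": "John", "age": ""}, {"name": "Mike", "age": "19"}]
print([r["name"] for r in rows if matches(r, "age__isnull", True)])
# ['John']
print([r["name"] for r in rows if matches(r, "name__icontains", "MI")])
# ['Mike']
```

The point is only that the part after the double underscore selects the comparison, and that a key without a suffix falls back to equality.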

A quick example of its usage with filters:

from leopards import Q

data = [{"name":"John","age":"16"}, {"name":"Mike","age":"19"}, {"name":"Sarah","age":"21"}]
filtered = Q(data, {"name__contains":"k", "age__lt":20}) ## this returns a generator
print(list(filtered))
## [{'name': 'Mike', 'age': '19'}]
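
For comparison, roughly the same result can be had in plain Python with a generator expression. This sketch does not use leopards at all; it casts age to int explicitly, which is my assumption about how the string ages in the sample data should be compared.

```python
data = [
    {"name": "John", "age": "16"},
    {"name": "Mike", "age": "19"},
    {"name": "Sarah", "age": "21"},
]

# Lazily keep rows whose name contains "k" and whose age is below 20.
# The ages are strings, so they are converted before comparing.
filtered = (row for row in data if "k" in row["name"] and int(row["age"]) < 20)

result = list(filtered)
print(result)
# [{'name': 'Mike', 'age': '19'}]
```

The leopards version reads better once there are more than a couple of conditions, since each one is just another key in the dict.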

It also provides a few aggregation functions: Count, Max, Min, Sum, and Avg.

Count, for example, can be used as follows:

from leopards import Count

data = [{"name": "John", "age": "16"}, {"name": "Mike", "age": "19"}, {"name": "Sarah", "age": "21"},{"name":"John","age":"19"}]
output = Count(data, ["age"])
## dict_values([{'age': '16', 'count': 1}, {'age': '19', 'count': 2}, {'age': '21', 'count': 1}])
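
If it helps to see what this grouping amounts to, here is a rough standard-library equivalent using collections.Counter. It is only a sketch of the same idea, not what leopards does internally, and it builds a plain list rather than dict_values.

```python
from collections import Counter

data = [
    {"name": "John", "age": "16"},
    {"name": "Mike", "age": "19"},
    {"name": "Sarah", "age": "21"},
    {"name": "John", "age": "19"},
]

# Count how many rows share each age; Counter keeps first-seen order.
counts = Counter(row["age"] for row in data)

# Reshape into one dict per group, mirroring the output shown above.
output = [{"age": age, "count": n} for age, n in counts.items()]
print(output)
# [{'age': '16', 'count': 1}, {'age': '19', 'count': 2}, {'age': '21', 'count': 1}]
```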

To use leopards on CSV files and perform the above filters and aggregations:

import csv
from leopards import Count

with open("filename.csv", newline="") as f:  ## closes the file when done
    data = csv.DictReader(f)
    output = Count(data, ["age"])
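
One thing to keep in mind with CSV input: csv.DictReader is a single-pass iterator and every value it yields is a string. The sketch below uses an in-memory string as a stand-in for a file (the sample data is made up) to show both points; if you want to run more than one filter or aggregation over the same file, it is probably safest to materialize the rows into a list first.

```python
import csv
import io

# In-memory stand-in for a CSV file (hypothetical sample data).
csv_text = "name,age\nJohn,16\nMike,19\nSarah,21\n"

reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)      # materialize so the rows can be reused
leftover = list(reader)  # the reader itself is now exhausted

print(rows[0])
# {'name': 'John', 'age': '16'}
print(leftover)
# []
```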
