Today I came across a brilliant Python library called leopards, which lets you apply some basic but frequently used filters and aggregations to Python lists. I've always used pandas for quick, ad hoc work with large lists or CSV files, but leopards sounds like a quick and performant alternative when you don't need to do any fancy data analysis.
Leopards provides the following filters:
- eq: equals; this is the default filter.
- gt: greater than.
- gte: greater than or equal.
- lt: less than.
- lte: less than or equal.
- in: the value is in a list or a tuple.
  - e.g. age__in=[10,20,30]
- contains: contains a substring, as in the usage example below.
- icontains: case-insensitive contains.
- startswith: checks if a value starts with a query string.
- istartswith: case-insensitive startswith.
- endswith: checks if a value ends with a query string.
- iendswith: case-insensitive endswith.
- isnull: checks if the value matches any of NULL_VALUES, which are ("", ".", None, "None", "null", "NULL").
  - e.g. filter__isnull=True or filter__isnull=False
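To make the field__operator naming concrete, here is a minimal plain-Python sketch of how such lookups could be evaluated against a list of dicts. It is not leopards' actual implementation: OPERATORS, split_lookup and simple_filter are hypothetical names invented for this illustration, only a subset of the filters is covered, and no type conversion is attempted (so the ages here are integers).

```python
# Hypothetical sketch: NOT leopards' code, just the general idea behind
# "field__operator" lookups applied to a list of dicts.
OPERATORS = {
    "eq": lambda value, query: value == query,
    "gt": lambda value, query: value > query,
    "gte": lambda value, query: value >= query,
    "lt": lambda value, query: value < query,
    "lte": lambda value, query: value <= query,
    "in": lambda value, query: value in query,
    "contains": lambda value, query: query in value,
    "icontains": lambda value, query: query.lower() in value.lower(),
    "startswith": lambda value, query: value.startswith(query),
    "endswith": lambda value, query: value.endswith(query),
}


def split_lookup(key):
    """Split 'age__lt' into ('age', 'lt'); a bare 'age' defaults to 'eq'."""
    field, sep, op = key.rpartition("__")
    if sep and op in OPERATORS:
        return field, op
    return key, "eq"


def simple_filter(rows, query):
    """Lazily yield the rows that satisfy every lookup in the query dict."""
    lookups = [(*split_lookup(key), expected) for key, expected in query.items()]
    for row in rows:
        if all(OPERATORS[op](row[field], expected) for field, op, expected in lookups):
            yield row


people = [
    {"name": "John", "age": 16},
    {"name": "Mike", "age": 19},
    {"name": "Sarah", "age": 21},
]
matches = list(simple_filter(people, {"name__contains": "k", "age__lt": 20}))
print(matches)
# [{'name': 'Mike', 'age': 19}]
```

All lookups in the query dict are combined with AND in this sketch, and the function is a generator, so nothing is evaluated until the result is iterated.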
A quick example of its usage with filters:
from leopards import Q
data = [{"name":"John","age":"16"}, {"name":"Mike","age":"19"}, {"name":"Sarah","age":"21"}]
filtered = Q(data, {"name__contains":"k", "age__lt":20})  # this returns a generator
print(list(filtered))
# [{'name': 'Mike', 'age': '19'}]
It also provides a few aggregation functions: Count, Max, Min, Sum, and Avg. Count, for example, can be used as follows:
from leopards import Count
data = [{"name": "John", "age": "16"}, {"name": "Mike", "age": "19"}, {"name": "Sarah", "age": "21"},{"name":"John","age":"19"}]
output = Count(data, ["age"])
# output: dict_values([{'age': '16', 'count': 1}, {'age': '19', 'count': 2}, {'age': '21', 'count': 1}])
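For comparison, the same kind of grouped count can be sketched with only the standard library. This is an illustration of the idea rather than how leopards computes it; count_by is a hypothetical helper name, and it returns a plain list instead of dict_values.

```python
from collections import Counter


def count_by(rows, fields):
    """Count rows per distinct combination of the given fields."""
    # Use a tuple of the selected field values as the grouping key.
    counts = Counter(tuple(row[field] for field in fields) for row in rows)
    # Rebuild one dict per group, with the count attached.
    return [{**dict(zip(fields, key)), "count": n} for key, n in counts.items()]


data = [
    {"name": "John", "age": "16"},
    {"name": "Mike", "age": "19"},
    {"name": "Sarah", "age": "21"},
    {"name": "John", "age": "19"},
]
result = count_by(data, ["age"])
print(result)
# [{'age': '16', 'count': 1}, {'age': '19', 'count': 2}, {'age': '21', 'count': 1}]
```

Passing more than one field, e.g. ["name", "age"], groups on the combination of those values.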
To use leopards on CSV files and perform the above filters and aggregations:
import csv
from leopards import Count
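# Run the aggregation while the file is still open; the with block closes it afterwards
with open("filename.csv", newline="") as f:
    data = csv.DictReader(f)
    output = Count(data, ["age"])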
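Two properties of csv.DictReader are worth keeping in mind when feeding it to any filter or aggregation: every value arrives as a string, and the reader can only be iterated once. The sketch below shows both using the standard library alone; the sample CSV text is made up for illustration, and it makes no claim about how leopards handles type conversion internally.

```python
import csv
import io

# In-memory stand-in for a CSV file (hypothetical sample data).
csv_text = "name,age\nJohn,16\nMike,19\nSarah,21\nJohn,19\n"

reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)  # materialize once so the rows can be reused

# Every value is a string, including the numeric-looking ones.
print(dict(rows[0]))
# {'name': 'John', 'age': '16'}

# The reader is now exhausted, so a second pass yields nothing.
leftover = list(reader)
print(leftover)
# []

# Convert explicitly when doing numeric comparisons in plain Python.
under_20 = [row for row in rows if int(row["age"]) < 20]
print([row["name"] for row in under_20])
# ['John', 'Mike', 'John']
```

If you want to run several filters or aggregations over the same file, converting the reader to a list first avoids silently getting empty results on the second pass.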