cargo install collie-search
github.com/suleymanozkeskin/collie
ripgrep on a 29K-file repo (Kubernetes). Each query re-walks and re-searches candidate files.
$ rg -l "kubelet"
$ rg -l "context"
$ collie search "kubelet"
Fresh run on the current build: rg still lands in the
400–880 ms range. Collie stayed under 14 ms on
the lexical sample.
grep/rg also have no understanding of code structure.
Just raw text.
A lexical and structural search CLI that indexes your codebase once, keeps it current while the daemon runs, and auto-stops when you're done.
Not a replacement for traditional search: Collie is specialized for agentic development and indexed per project / worktree.
# Start the daemon (indexes + watches for changes)
$ collie watch .
# Search instantly
$ collie search "handler"
# Stop when done (or auto-stops after 30min idle)
$ collie stop .
Index once → query faster → incremental updates via FSEvents
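The index-once, update-incrementally flow above can be illustrated with a toy inverted index. This is a minimal Python sketch, not Collie's implementation: `TinyIndex`, its methods, and the tokenizer are hypothetical names chosen for illustration, and the file-watcher that would call `update_file` on each change is left out.

```python
import re
from collections import defaultdict

# Assumption: tokens are identifier-like runs, compared case-insensitively.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


class TinyIndex:
    """Toy inverted index (token -> set of paths). Hypothetical, for illustration only."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> paths that contain it
        self.file_tokens = {}             # path -> tokens seen when last indexed

    def update_file(self, path, text):
        """Re-index one changed file, touching only postings that differ."""
        new = {t.lower() for t in TOKEN_RE.findall(text)}
        old = self.file_tokens.get(path, set())
        for tok in old - new:
            self.postings[tok].discard(path)
        for tok in new - old:
            self.postings[tok].add(path)
        self.file_tokens[path] = new

    def remove_file(self, path):
        """Drop a deleted file from every posting list it appeared in."""
        for tok in self.file_tokens.pop(path, set()):
            self.postings[tok].discard(path)

    def search(self, *terms):
        """Multi-term AND: paths that contain every term."""
        sets = [self.postings.get(t.lower(), set()) for t in terms]
        return sorted(set.intersection(*sets)) if sets else []
```

A query reads only the posting sets for its terms, so its cost does not depend on walking the repository again; a file change costs one `update_file` call for that file.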
% wildcards
$ collie search "kube%"           # prefix
$ collie search "%config"         # suffix
$ collie search "handle request"  # multi-term AND
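The `%` wildcard examples above can be read as prefix, suffix, and substring matches against a token. The Python sketch below shows that reading only; `wildcard_match` is a hypothetical helper, and case-insensitive comparison and whole-token matching for patterns without `%` are assumptions, not documented Collie behavior.

```python
def wildcard_match(pattern, token):
    """Match one token against a %-wildcard pattern (illustrative semantics).

    Assumptions: comparison is case-insensitive, and a pattern without
    any % must equal the whole token.
    """
    p, t = pattern.lower(), token.lower()
    leading, trailing = p.startswith("%"), p.endswith("%")
    core = p.strip("%")
    if leading and trailing:
        return core in t           # %request% : substring
    if trailing:
        return t.startswith(core)  # kube%     : prefix
    if leading:
        return t.endswith(core)    # %config   : suffix
    return t == core               # no wildcard: exact token
```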
kind: lang:
path: qname:
$ collie search "kind:fn handler"
$ collie search "kind:struct lang:go"
$ collie search "kind:method qname:Server::run"
$ collie search -e "func\s+\w+Handler"
$ collie search -e "TODO|FIXME|HACK"    # multi-pattern
$ collie search -e "impl.*Error" -i     # case-insensitive
--symbol-regex
$ collie search "kind:fn %Handler" --symbol-regex '\*.*Server'
kind:fn|method|class|struct|enum|interface|trait|variable|field|property|constant|module|type|import
lang:go|rust|python|typescript|c|cpp|java|csharp|ruby|zig
path:src/api/       # scope to directory
qname:Server::run   # qualified name
$ collie search "kind:fn lang:go path:pkg/api/ init"
$ collie search "kind:struct Config"
$ collie search "kind:method qname:HollowKubelet::Run"
$ collie search "kind:trait path:src/ %Error%"
14 symbol kinds. 10 languages. Powered by Tree-sitter AST parsing.
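An agent that already knows the kind, language, and rough location can assemble these filter queries from structured fields. The Python helper below is a small sketch: `build_query` is a hypothetical name, and putting filters before free-text terms simply mirrors the examples above (whether Collie cares about order is not stated here).

```python
def build_query(*terms, kind=None, lang=None, path=None, qname=None):
    """Assemble a filter query string in the shape used by the examples.

    Hypothetical helper: only the four filters shown above are supported,
    and values are passed through without validation.
    """
    parts = []
    for name, value in (("kind", kind), ("lang", lang),
                        ("path", path), ("qname", qname)):
        if value:
            parts.append(f"{name}:{value}")
    parts.extend(terms)
    return " ".join(parts)
```

For instance, `build_query("init", kind="fn", lang="go", path="pkg/api/")` yields the string used in the first example above, ready to pass as the single argument to `collie search`.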
include_pdfs = true
$ collie search "authentication flow"
src/auth/handler.go:14 func authFlow() {
docs/design-spec.pdf:3 Section 4.2: Authentication Flow
rfcs/auth-v2.pdf:1 RFC: Revised authentication flow
Design specs, RFCs, API docs, research papers — indexed
alongside code.
grep and ripgrep can't do this.
Benchmark — Kubernetes repo, 28,903 files
| Query | Collie p50 | ripgrep range | Speedup |
|---|---|---|---|
| kubelet | 12 ms | 744–878 ms | 63–74x |
| context | 7 ms | 403–610 ms | 58–88x |
| controller | 7 ms | 398–427 ms | 56–60x |
| kube% (prefix) | 8 ms | 491–520 ms | 58–61x |
| %config (suffix) | 8 ms | 401–422 ms | 48–51x |
| %request% (substring) | 104 ms | 413–429 ms | 4x |
Fresh run on the current build, 5 measured runs per query. Collie stayed tight; rg still paid full-scan cost on every query.
Benchmark — Kubernetes repo
| Query | Collie p50 | Range |
|---|---|---|
| kind:fn handler | 7 ms | 6–8 ms |
| kind:struct SharedInformerFactory | 6 ms | 6–7 ms |
| kind:fn path:pkg/ init | 12 ms | 11–13 ms |
| kind:method qname:HollowKubelet::Run | 6 ms | 6–8 ms |
| kind:fn validate webhook | 7 ms | 6–7 ms |
Find functions, structs,
methods — not just text matches.
Powered by Tree-sitter across 11
languages.
Benchmark — Kubernetes repo
| Pattern | Collie p50 (-n 50) | ripgrep p50 | Speedup |
|---|---|---|---|
| func\s+\w+Handler | 10 ms | 392 ms | 38x |
| interface\s*\{ | 9 ms | 415 ms | 46x |
| context\.Context | 12 ms | 422 ms | 34x |
Collie extracts literal fragments from your regex,
narrows candidate files via the index, then applies the
full regex.
Honest framing: Collie regex is optimized for
interactive bounded queries.
For exhaustive -n 0 scans, ripgrep still
wins today.
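The extract-literals, narrow, then apply-the-full-regex pipeline described above can be sketched in a few lines of Python. This is a deliberately conservative illustration, not Collie's extractor: `required_literals` and `regex_search` are hypothetical names, the `files` dict stands in for both the index and the file contents, and any pattern with alternation, groups, or character classes falls back to scanning every file.

```python
import re


def required_literals(pattern, min_len=3):
    """Return literal runs that every match must contain, or [] if unsure."""
    if any(c in pattern for c in "|()[]"):
        return []  # too complex for this sketch: no prefilter
    literals, run, i = [], "", 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt.isalnum():   # \s, \w, \d, \b: class or anchor ends the run
                literals.append(run)
                run = ""
            else:               # \. or \{ is an escaped literal character
                run += nxt
            i += 2
            continue
        if c in "*?":           # previous character becomes optional
            literals.append(run[:-1])
            run = ""
        elif c in "+.^$":       # repetition, wildcard, or anchor ends the run
            literals.append(run)
            run = ""
        elif c == "{":          # counted repetition: give up on prefiltering
            return []
        else:
            run += c
        i += 1
    literals.append(run)
    return [lit for lit in literals if len(lit) >= min_len]


def regex_search(pattern, files):
    """files: dict of path -> text. Returns (matching paths, candidate count)."""
    literals = required_literals(pattern)
    # A real index would answer this from postings without reading the files.
    candidates = [p for p, text in files.items()
                  if all(lit in text for lit in literals)]
    rx = re.compile(pattern)
    return [p for p in candidates if rx.search(files[p])], len(candidates)
```

With `func\s+\w+Handler`, the required literals are `func` and `Handler`, so only files containing both reach the regex engine. A pattern such as `TODO|FIXME|HACK` yields no required literal in this sketch and every file becomes a candidate.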
Why this matters for agents
$ rg "func \(.*Authorization.*\) Validate" 522 ms
$ collie search "kind:method path:pkg/kubeapiserver/options/ %validate%" 72 ms
Agents already know the kind, language, and approximate
location.
Structural queries turn that knowledge
into precise results.
New capability
Narrow with structure first, refine with regex. No rg equivalent.
| Intent | Collie | rg regex | Speedup |
|---|---|---|---|
| Methods on *Server ending in Handler | 70 ms | 406 ms | 5.8x |
| Methods on Server containing Handler | 7 ms | 404 ms | 54x |
| Validate functions related to webhooks | 281 ms | 373 ms | 1.3x |
$ collie search "kind:fn %Handler" --symbol-regex '\*.*Server'
$ collie search "kind:method qname:Server::" --symbol-regex 'Handler'
$ collie search "kind:fn %validate%" --symbol-regex 'webhook|Webhook'
The more structural info in the symbol query, the faster the regex refinement.
Agentic benchmark — per-task
| Task | Symbol | Lexical | ripgrep |
|---|---|---|---|
| Find HollowKubelet::Run | 43 ms | 23 ms | 730 ms |
| Webhook token authenticator | 95 ms | 36 ms | 548 ms |
| Authorization options validator | 72 ms | 19 ms | 522 ms |
| SharedInformerFactory constructor | 111 ms | 27 ms | 398 ms |
| PodDisruptionBudget validator | 69 ms | 176 ms | 373 ms |
--format json: structured, parseable, no regex-on-grep hacks
Exit codes: 0 results, 1 nothing, 2 error; shell-scriptable
collie skill prints a reference card for agent context
$ collie search "kind:fn handler" \
    --format json -n 5
{
  "type": "symbol",
  "count": 5,
  "results": [{
    "path": "pkg/api/handler.go",
    "line": 42,
    "kind": "function"
  }, ...]
}
$ echo $?
0
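On the agent side, the exit code and JSON output can be consumed without any text scraping. The Python sketch below assumes only what the sample above shows (a top-level `results` list whose entries carry `path`, `line`, and `kind`, and the 0/1/2 exit codes); `interpret` is a hypothetical helper, and the full output schema may contain more fields.

```python
import json

# Exit codes as listed above: 0 results, 1 nothing, 2 error.
EXIT_MEANING = {0: "results", 1: "nothing", 2: "error"}


def interpret(returncode, stdout):
    """Turn an exit code plus JSON stdout into (status, [(path, line, kind)]).

    Assumption: the payload has the shape of the sample output, i.e. a
    "results" list of objects with "path", "line", and "kind" keys.
    """
    status = EXIT_MEANING.get(returncode, "unknown")
    if status != "results":
        return status, []  # nothing to parse on "nothing" or "error"
    payload = json.loads(stdout)
    return status, [(r["path"], r["line"], r["kind"])
                    for r in payload["results"]]


# Typical wiring (not executed here, requires collie on PATH):
#   import subprocess
#   proc = subprocess.run(
#       ["collie", "search", "kind:fn handler", "--format", "json", "-n", "5"],
#       capture_output=True, text=True)
#   status, hits = interpret(proc.returncode, proc.stdout)
```

Checking the exit code before parsing means an empty result never reaches `json.loads`, which keeps the "nothing" and "error" paths separate from a successful search.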
All Rust. Single binary. No runtime dependencies.
Go • Rust • Python • TypeScript • JavaScript • C • C++ • Java • C# • Ruby • Zig
Symbol extraction: functions, methods, structs, classes,
enums,
interfaces, traits, variables, fields, constants,
modules, type aliases, imports
| Repo | Files | Rebuild | Index size | Peak RAM |
|---|---|---|---|---|
| Collie repo | 75 | 0.3s | 675 KB | 26 MB |
| Kubernetes | 28,903 | 8.7s | 391 MB | — |
One-time cost. After that,
notify
keeps the index current incrementally.
$ cargo install collie-search
$ collie watch .     # start daemon, index repo
$ collie status .    # verify it's running
$ collie search "handleRequest"
$ collie search "kind:fn path:src/ init"
$ collie search -e "func\s+\w+Handler"
$ collie search "kind:fn %Handler" --symbol-regex '\*.*Server'
$ collie search "config" --format json    # for agents
$ collie skill # prints reference card for LLM context