Paul Frazee 6 months ago

Neo4j has an interesting approach to handling complexity beyond the core query model - a plugin system which adds procedures. I had a query that was running out of memory. Turns out they have a batching system in one of their plugins. https://morel.us-east.host.bsky.network/xrpc/com.atproto.sync.getBlob?did=did:plc:ragtjsm2j2vknwkz3zp4oxrd&cid=bafkreiam2mzrkt72utyuniysvwutyimkqlprv6kq5kfsacxofr64v2xf3e
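
A minimal sketch of the batching idea, in Python rather than Cypher. The post does not name the plugin or the procedure (APOC's `apoc.periodic.iterate` is one procedure of this kind), so this is not that API. The names `batches` and `process_in_batches` are hypothetical. The point is only that handling a large result in fixed-size chunks keeps memory bounded by the chunk size instead of the full result.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(items)
    while True:
        # Only one chunk is materialized at a time.
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def process_in_batches(
    items: Iterable[T], batch_size: int, handle_batch: Callable[[List[T]], None]
) -> int:
    """Apply handle_batch to each chunk and return the number of chunks processed."""
    count = 0
    for chunk in batches(items, batch_size):
        handle_batch(chunk)
        count += 1
    return count
```

In a real database the equivalent of `handle_batch` would typically commit its own transaction, so a failure or memory spike is limited to one chunk.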

Paul Frazee 6 months ago

Mark Hamill and George Takei RE:

Paul Frazee 6 months ago

I've loaded up neo4j with a pretty big amount of the follow graph. Anybody got a query they want me to run?

Paul Frazee 6 months ago

Paul Frazee 7 months ago

Does it count as inventing it if you Google and find out it already exists?

Paul Frazee 7 months ago

watching htop while I run neo4j commands with the same energy that other people watch sports

Paul Frazee 7 months ago

We're in business. 4.9M nodes, 348M follows, imported into neo4j in 11.5min. Listing my follows takes 775ms. Listing all my followers takes 1236ms.
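
For scale, the import figures work out to roughly 500k follows per second and about 71 follows per account on average. A quick back-of-the-envelope check, using only the numbers in the post (the variable names are just for illustration):

```python
# Figures quoted in the post.
nodes = 4_900_000
follows = 348_000_000
import_seconds = 11.5 * 60  # 11.5 minutes = 690 seconds

# Derived rates.
follows_per_second = follows / import_seconds
avg_follows_per_node = follows / nodes

print(f"{follows_per_second:,.0f} follows imported per second")
print(f"{avg_follows_per_node:.1f} follows per node on average")
```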

Paul Frazee 7 months ago

neo4j import tool has a --bad-tolerance flag, which more software should have
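
A rough sketch of what a bad-tolerance option could look like in other software, here a small CSV loader. It is not the Neo4j import tool's implementation. The names `load_follows` and `TooManyBadRows`, the two-column row format, and the exact threshold semantics (abort once the bad-row count exceeds the tolerance) are all assumptions for illustration.

```python
import csv
import io
from typing import List, Tuple


class TooManyBadRows(Exception):
    """Raised when more rows are malformed than the caller is willing to tolerate."""


def load_follows(csv_text: str, bad_tolerance: int) -> Tuple[List[Tuple[str, str]], int]:
    """Parse 'follower,followed' rows, skipping up to bad_tolerance malformed rows.

    Returns the good rows and the number of rows skipped.
    """
    good: List[Tuple[str, str]] = []
    bad = 0
    for row in csv.reader(io.StringIO(csv_text)):
        # Assumed validity rule: exactly two non-empty fields.
        if len(row) == 2 and all(field.strip() for field in row):
            good.append((row[0].strip(), row[1].strip()))
            continue
        bad += 1
        if bad > bad_tolerance:
            raise TooManyBadRows(f"{bad} bad rows exceeds tolerance of {bad_tolerance}")
    return good, bad
```

The design choice is that a handful of malformed rows in a very large import should be counted and skipped rather than failing the whole job, while a flood of them still aborts.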

Paul Frazee 7 months ago

I feel differently about it now but back in ’04 I did download a car

Paul Frazee 7 months ago

[code talk] #atproto I now have a snapshot dataset of ~4.8M car files - users that hit the new relay since it was started. I rigged up a node cluster (12 workers) that runs through the car files and dumps the follow graph into 12 different CSVs. Throughput bounces btwn 200-400ps https://morel.us-east.host.bsky.network/xrpc/com.atproto.sync.getBlob?did=did:plc:ragtjsm2j2vknwkz3zp4oxrd&cid=bafkreib3qnlh4bydtfrxmiu4f76ygmqsuzelu2m7vihnwyvfawdg4cs6cy
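
The fan-out described above might look something like this sketch. It is written in Python rather than Node, it does not parse real car files, and it runs the shards sequentially instead of in parallel worker processes. The `Repo` tuple is a hypothetical stand-in for one already-parsed repo, and `partition`, `write_follow_csv`, and `dump_follow_graph` are illustrative names, not the actual pipeline.

```python
import csv
import io
from typing import List, Sequence, Tuple

# Hypothetical stand-in for one parsed car file: (account DID, DIDs it follows).
Repo = Tuple[str, List[str]]


def partition(repos: Sequence[Repo], num_workers: int) -> List[List[Repo]]:
    """Assign repos to workers round-robin so each shard is roughly the same size."""
    shards: List[List[Repo]] = [[] for _ in range(num_workers)]
    for i, repo in enumerate(repos):
        shards[i % num_workers].append(repo)
    return shards


def write_follow_csv(shard: List[Repo]) -> str:
    """Emit one 'follower,followed' row per follow edge in this shard."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for did, follows in shard:
        for target in follows:
            writer.writerow([did, target])
    return buf.getvalue()


def dump_follow_graph(repos: Sequence[Repo], num_workers: int = 12) -> List[str]:
    """Produce one CSV per worker; a real setup would run each shard in its own process."""
    return [write_follow_csv(shard) for shard in partition(repos, num_workers)]
```

Giving each worker its own output file avoids any coordination between workers on writes, and a bulk importer can then be pointed at all of the files together.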