HACKER Q&A
📣 swilliams231

What are people's experiences with knowledge graphs?


I see lots of YouTube videos and content about knowledge graphs in the context of Gen AI. Are these at all useful for personal information retrieval and organization? If so, are there any frameworks or products that you'd recommend that help construct and use knowledge graphs?


👤 marshughes Accepted Answer ✓
Knowledge graphs are really useful for personal information retrieval and organization. They can integrate and correlate scattered information, making searches more efficient. For example, when you're looking for job-related materials, a knowledge graph can quickly link information like salary and job requirements. For construction, I'd recommend the open-source Dgraph, which scales well and supports complex queries. What kind of information do you plan to manage with a knowledge graph?

👤 ldjkfkdsjnv
I have worked on a large-scale, real-time, in-production knowledge graph at a FAANG company. It runs a major service everyone has used. My personal opinion is that they are outdated. They are brittle, hard to maintain, and constantly changing, with technical complexity like you wouldn't believe. A moving target in the real world.

I think as a paradigm, you should just pass everything that might be relevant into the context of an LLM, as opposed to traversing a knowledge graph.


👤 tough
As a non-expert who is interested in the field of AI / LLMs, it feels -intuitively- that the symbolic reasoning layer that a KG and ontologies bring is the only way to ground the hallucinations of LLMs in -truths-. If regular LLMs are just auto-complete, how do you give them the capability to reason about the world and the environment they operate in? How do you ascribe -truth- to certain words or entities? Most of this can be solved by giving the model the specific context, but in the specific area of autonomous -super intelligence- one would expect the system to be able to gather, construct, and expand on such a knowledge graph by itself.

A recent paper has AI continuously rebuilding the KG, for example: https://arxiv.org/abs/2502.13025


👤 westurner
Property graphs don't specify schema.

Is it Shape.color or Shape.coleur, feet or meters?

RDF has URIs for predicates (attributes). RDFS specifies :Class(es) with :Property's, which are identified by URIs.
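
A minimal sketch of that difference, in plain Python with no RDF library. The `EX` namespace, the shape URIs, and the helper names are hypothetical, made up only for illustration:

```python
# Property-graph style: attribute names are free-form strings, so two
# producers can disagree on spelling without anything noticing.
node_a = {"label": "Shape", "color": "red"}
node_b = {"label": "Shape", "coleur": "blue"}

def property_values(key, nodes):
    """Return the values stored under a free-form attribute name."""
    return [n[key] for n in nodes if key in n]

# RDF style: every predicate is a URI taken from a shared vocabulary.
EX = "https://example.org/vocab#"  # hypothetical namespace
COLOR = EX + "color"

triples = [
    ("https://example.org/shape/1", COLOR, "red"),
    ("https://example.org/shape/2", COLOR, "blue"),
]

def values_for(predicate, triples):
    """Return all objects asserted with the given predicate URI."""
    return [o for (_s, p, o) in triples if p == predicate]

print(property_values("color", [node_a, node_b]))  # only finds node_a's value
print(values_for(COLOR, triples))                  # finds both
```

URIs don't stop someone from minting a second predicate for the same thing, but they do make it explicit which vocabulary term is meant.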

E.g. Wikidata has schema; forms with validation. DBpedia is Wikipedia infoboxes regularly extracted to RDF.

Google acquired Metaweb (the company behind Freebase) years ago, launched a Knowledge Graph product, and these days supports Structured Data search cards in Microdata, RDFa, and JSON-LD.

[LLM] NN topology is sort of a schema.

Linked Data standards for data validation include RDFS and SHACL. JSON schema is far more widely implemented.

RDFa is "RDF in HTML attributes".

How much more schema does the application need beyond [WikiWord] auto-linkified edges? What about typed edges with attributes other than href and anchor text?
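
For a sense of how little schema the auto-linkified baseline has, here is a rough sketch that turns WikiWords into untyped edges. The regex, the `pages` dict, and `extract_edges` are assumptions for illustration, not any particular wiki's rules:

```python
import re

# CamelCase "WikiWord": two or more capitalized runs, e.g. KnowledgeGraph.
WIKIWORD = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")

def extract_edges(page_name, text):
    """Return untyped (source, target) edges for each WikiWord in text."""
    targets = sorted(set(WIKIWORD.findall(text)) - {page_name})
    return [(page_name, target) for target in targets]

# Hypothetical pages, keyed by page name.
pages = {
    "HomePage": "See KnowledgeGraph and QueryLanguage.",
    "KnowledgeGraph": "Back to HomePage.",
}

edges = [
    edge
    for name, text in pages.items()
    for edge in extract_edges(name, text)
]
print(edges)
```

Every edge here means only "mentions"; a typed edge would need at least a third field for the relation, which is where schema starts to creep in.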

AtomSpace is an in-memory hypergraph with schema to support graph rewriting specifically for reasoning and inference.

There are ORMs for graph databases. Just like SQL, how much of the query and report can be done by the server without processing every SELECTed row?

Query languages for graphs: SQL, SPARQL, SPARQL-star, GraphQL, Cypher, Gremlin.

Object-attribute level permissions are for the application to implement and enforce. Per-cell keys and visibility are native db features of e.g. Accumulo, but to implement the same with e.g. Postgres every application that is a database client is on scout's honor to also enforce object-attribute access control lists.
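
What "the application enforces it" tends to look like, as a rough sketch. The `ACL` table, role names, and `visible_attributes` are hypothetical; the point is that every client has to run something like this itself, because the database won't:

```python
# Hypothetical ACL: (object type, attribute) -> roles allowed to read it.
ACL = {
    ("Person", "name"): {"public", "staff"},
    ("Person", "salary"): {"staff"},
}

def visible_attributes(obj_type, record, role):
    """Return only the attributes this role may read; deny by default."""
    return {
        attr: value
        for attr, value in record.items()
        if role in ACL.get((obj_type, attr), set())
    }

record = {"name": "Ada", "salary": 100, "notes": "not listed in the ACL"}

print(visible_attributes("Person", record, "public"))
print(visible_attributes("Person", record, "staff"))
```

Attributes with no ACL entry are hidden from everyone in this sketch, which is a design choice (deny by default) rather than something the original post prescribes.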

And then identity; which user with which (sovereign or granted) cryptographic key can add dated named graphs that mutate which data in the database.

So, property graphs eventually need schema and data validation.

markmap.js.org is a simple app to visualize a markdown document with headings and/or list items as a mindmap; but unlike Freemind, there's no way to add edges that make the tree a cyclic graph.

Cyclic graphs require different traversal algorithms. For example, Python will raise RecursionError when a recursive traversal encounters a graph cycle without a visited node list, and a stack-based traversal of a cyclic graph will not halt without e.g. a visited node list to detect cycles. A valid graph path may still contain cycles (and there is feedback in so many general systems).
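
A small sketch of both behaviors, assuming the graph is a plain dict of adjacency lists (the `graph`, `naive_walk`, and `reachable` names are made up for the example):

```python
def naive_walk(graph, node):
    """Recursive walk with no visited set; recurses forever on a cycle."""
    for neighbor in graph.get(node, []):
        naive_walk(graph, neighbor)

def reachable(graph, start):
    """Iterative depth-first traversal that terminates on cyclic graphs."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue  # already expanded; this check is what breaks the cycle
        visited.add(node)
        order.append(node)
        # Push neighbors in reverse so they are expanded in listed order.
        stack.extend(reversed(graph.get(node, [])))
    return order

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # a -> b -> c -> a

try:
    naive_walk(graph, "a")
except RecursionError:
    print("naive_walk hit the recursion limit on the cycle")

print(reachable(graph, "a"))
```

Note that the visited set answers "which nodes can I reach", not "which paths exist"; enumerating paths that are allowed to revisit nodes needs a different bound, such as a maximum path length.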

YAML-LD is JSON-LD in YAML.

JSON-LD as a templated output is easier than writing a (relatively slow) native RDF application and re-solving for what SQL ORM web frameworks already do.
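
Roughly what "templated output" means here: take the row your ORM already gives you and wrap it in a JSON-LD envelope. A sketch, assuming a hypothetical `row` dict and example.org URLs; the schema.org `Person`, `name`, and `knows` terms are real, everything else is made up:

```python
import json

BASE = "https://example.org/people/"  # hypothetical URL scheme

def person_jsonld(row):
    """Render one ORM-style row as a JSON-LD document."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "@id": f"{BASE}{row['id']}",
        "name": row["name"],
        # Link to other people by @id instead of embedding them.
        "knows": [{"@id": f"{BASE}{other}"} for other in row["knows"]],
    }

row = {"id": 1, "name": "Ada", "knows": [2]}
doc = json.dumps(person_jsonld(row), indent=2)
print(doc)
```

The application stays an ordinary SQL-backed web app; only the serialization step knows anything about Linked Data.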

There are specs for cryptographically signing RDF such that the signature matches regardless of the graph representation.
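
The core idea, heavily simplified: canonicalize the graph first, then hash (and sign) the canonical form. The sketch below only sorts ground triples and hashes them; it is not an implementation of any of those specs, which also have to label blank nodes deterministically. `graph_digest` and the example triples are hypothetical:

```python
import hashlib

def graph_digest(triples):
    """Hash a set of ground triples independent of their input order.

    Assumes no blank nodes, and that each object is already serialized
    (e.g. '"red"' for a literal or '<https://...>' for a URI).
    """
    lines = sorted(f"<{s}> <{p}> {o} ." for s, p, o in set(triples))
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()

EX = "https://example.org/"  # hypothetical namespace
graph_one = [
    (EX + "shape/1", EX + "vocab#color", '"red"'),
    (EX + "shape/2", EX + "vocab#color", '"blue"'),
]
graph_two = list(reversed(graph_one))  # same graph, different ordering

print(graph_digest(graph_one) == graph_digest(graph_two))
```

The digest is what you would then sign with whatever key scheme you use; the signature survives reordering or re-serialization because it covers the canonical form rather than the bytes of one particular file.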