Will Percey — Portfolio

Knowledge Graphs

Updated Dec 2025

Knowledge Graph Fundamentals

Graph Structure

Knowledge graphs represent information as nodes (entities) connected by edges (relationships). Each edge can have a type and direction. Properties store attributes on both nodes and edges, enabling rich data modeling beyond simple connections.

Key Features

Nodes: Entities (people, places, concepts)
Edges: Relationships between entities
Properties: Attributes on nodes/edges
Directed or undirected connections
Labels/types for categorization

Similar Technologies

Relational DB, Document DB, Vector DB
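
A minimal sketch of the node/edge/property model described above, using only the Python standard library. The class names (Node, Edge, PropertyGraph) and the sample data are illustrative assumptions, not the API of any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    label: str                      # type/category, e.g. "Person"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str                     # id of the start node
    target: str                     # id of the end node
    type: str                       # relationship type, e.g. "WORKS_AT"
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.nodes = {}             # node id -> Node
        self.edges = []             # directed edges, in insertion order

    def add_node(self, node):
        self.nodes[node.id] = node

    def add_edge(self, edge):
        # Reject edges that point at unknown nodes.
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(edge)

    def outgoing(self, node_id, edge_type=None):
        """Edges leaving node_id, optionally filtered by relationship type."""
        return [e for e in self.edges
                if e.source == node_id and (edge_type is None or e.type == edge_type)]


graph = PropertyGraph()
graph.add_node(Node("alice", "Person", {"name": "Alice"}))
graph.add_node(Node("acme", "Company", {"name": "Acme"}))
graph.add_edge(Edge("alice", "acme", "WORKS_AT", {"since": 2021}))

employers = [graph.nodes[e.target].properties["name"]
             for e in graph.outgoing("alice", "WORKS_AT")]
```

Here the edge carries its own property (since), which is the part a plain adjacency list cannot express.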

Triples (Subject-Predicate-Object)

The atomic unit of knowledge representation. 'Alice knows Bob' becomes (Alice, knows, Bob). Triples can be combined to represent complex knowledge. Foundation of RDF and semantic web standards.

Key Features

Subject: The entity being described
Predicate: The relationship type
Object: Target entity or literal value
Composable into complex graphs
Machine-readable knowledge

Similar Technologies

Property Graphs, Hypergraphs, Labeled Graphs
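
As a rough illustration of triples as the atomic unit, the sketch below stores them as plain Python tuples and matches patterns with None as a wildcard. The match helper and the sample facts are assumptions for illustration; real RDF stores use IRIs and typed literals.

```python
triples = {
    ("Alice", "knows", "Bob"),
    ("Bob", "knows", "Carol"),
    ("Alice", "age", 34),          # the object can be a literal value
}


def match(store, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (t for t in store
         if (subject is None or t[0] == subject)
         and (predicate is None or t[1] == predicate)
         and (obj is None or t[2] == obj)),
        key=str,                   # stable order even with mixed object types
    )


knows_edges = match(triples, predicate="knows")
alice_facts = match(triples, subject="Alice")
```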

Ontologies

Formal specifications defining concepts, relationships, and constraints in a domain. Ontologies enable reasoning, inference, and semantic interoperability. They range from simple taxonomies to expressive formal logics.

Key Features

Class hierarchies (is-a relationships)
Property definitions and constraints
Domain and range specifications
Inference rules
Cross-domain integration

Similar Technologies

Taxonomies, Schemas, Data Dictionaries
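
A small sketch of two inference steps an ontology makes possible: walking an is-a hierarchy, and typing a subject from a property's domain. The classes, the treats property, and the helper names are hypothetical, and the hierarchy is assumed to be acyclic with a single parent per class.

```python
subclass_of = {                    # child class -> parent class (is-a)
    "Surgeon": "Doctor",
    "Doctor": "Person",
}
domain_of = {"treats": "Doctor"}   # the subject of "treats" is a Doctor

facts = [("dana", "treats", "sam")]


def ancestors(cls):
    """All superclasses of cls, following is-a links upward."""
    found = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        found.append(cls)
    return found


def infer_types(facts):
    """Derive class membership from property domains plus the hierarchy."""
    types = {}
    for subject, predicate, _ in facts:
        if predicate in domain_of:
            cls = domain_of[predicate]
            types.setdefault(subject, set()).update([cls, *ancestors(cls)])
    return types


inferred = infer_types(facts)
```

Nothing in the data states that dana is a Person; that fact follows from the domain of treats and the Doctor is-a Person link.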

Graph Traversal

Navigate the graph by following edges from node to node. Enables multi-hop queries, pathfinding, and pattern matching. Traversal algorithms like BFS/DFS power recommendation and fraud detection systems.

Key Features

Multi-hop queries
Shortest path algorithms
Pattern matching
Neighborhood exploration
Subgraph extraction

Similar Technologies

SQL Joins, Map-Reduce, Index Lookups
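
The multi-hop and shortest-path ideas above can be sketched with breadth-first search over an adjacency list. The graph and the shortest_path helper are illustrative; this version assumes unweighted edges, where BFS is enough to find a shortest path.

```python
from collections import deque

graph = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Dave"],
    "Carol": ["Dave"],
    "Dave": ["Erin"],
    "Erin": [],
}


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = shortest_path(graph, "Alice", "Erin")
```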

Graph Databases

Database	Type	Query Language	Best For	Managed Options
Neo4j	Native Graph	Cypher	General purpose, fraud detection, recommendations	Neo4j Aura, Self-hosted
Amazon Neptune	Native Graph	Gremlin, SPARQL, openCypher	AWS ecosystem, RDF/Property graphs	Fully managed (AWS)
Azure Cosmos DB	Multi-model	Gremlin	Azure ecosystem, global distribution	Fully managed (Azure)
TigerGraph	Native Graph	GSQL	Deep link analytics, real-time ML	TigerGraph Cloud
JanusGraph	Graph layer over pluggable storage	Gremlin	Scalable, open-source, pluggable backends	Self-hosted, IBM Compose
ArangoDB	Multi-model	AQL	Document + Graph hybrid, flexibility	ArangoGraph (formerly ArangoDB Oasis)
Dgraph	Native Graph	GraphQL, DQL	GraphQL-native, horizontal scaling	Dgraph Cloud

Query Languages

Cypher

Neo4j's declarative graph query language. Pattern-based syntax using ASCII art for intuitive graph patterns. The most widely used property graph query language; its specification is openly available through the openCypher project, and it heavily influenced the ISO GQL standard.

Key Features

ASCII-art pattern matching: (a)-[r]->(b)
Declarative and readable
MATCH, CREATE, MERGE operations
Aggregations and filtering
Open specification via openCypher

Similar Technologies

Gremlin, SPARQL, GraphQL

SPARQL

W3C standard query language for RDF graphs. Pattern matching against triples with powerful federation and reasoning capabilities. Essential for semantic web and linked data applications.

Key Features

Triple pattern matching
Federated queries across endpoints
CONSTRUCT for graph creation
Inference via RDFS/OWL entailment regimes (where the store supports them)
Standard for RDF databases

Similar Technologies

Cypher, Gremlin, SQL with graph extensions
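
To make triple pattern matching concrete, here is a toy evaluator for a conjunction of triple patterns, roughly what the WHERE clause of a SELECT query does. It is a sketch in plain Python, not a SPARQL engine: names starting with "?" are treated as variables, and the data and helper names (match_pattern, solve) are assumptions.

```python
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "acme"),
]


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:   # variable already bound elsewhere
                return None
        elif p != t:
            return None
    return new


def solve(patterns, store):
    """Join all triple patterns, carrying variable bindings between them."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in store:
                result = match_pattern(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings


# Roughly: SELECT ?friend ?org WHERE { alice knows ?friend . ?friend worksAt ?org }
results = solve([("alice", "knows", "?friend"),
                 ("?friend", "worksAt", "?org")], triples)
```

The shared variable ?friend is what turns two independent lookups into a join.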

Gremlin

Apache TinkerPop's graph traversal language. Functional, step-based approach to navigating graphs. Supported by many graph databases including Neptune, JanusGraph, and Cosmos DB.

Key Features

Traversal-based queries
Functional composition
Turing-complete language
Wide database support
Imperative and declarative styles

Similar Technologies

Cypher, SPARQL, GraphQL
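
The step-based, composable style can be imitated in a few lines of Python. This is a toy stand-in and not TinkerPop: the Traversal class and its out, dedup, and to_list methods are assumptions that only mirror the shape of a Gremlin traversal.

```python
class Traversal:
    """Toy step-based traversal; each step returns a new Traversal."""

    def __init__(self, edges, current):
        self.edges = edges              # list of (source, label, target)
        self.current = list(current)    # vertices the traversal is "at"

    def out(self, label):
        """Move from each current vertex along outgoing edges with this label."""
        reached = [target for vertex in self.current
                   for (source, lbl, target) in self.edges
                   if source == vertex and lbl == label]
        return Traversal(self.edges, reached)

    def dedup(self):
        """Drop repeated vertices, keeping first-seen order."""
        return Traversal(self.edges, dict.fromkeys(self.current))

    def to_list(self):
        return list(self.current)


edges = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "dave"),
    ("carol", "knows", "dave"),
]

# Comparable in spirit to: g.V('alice').out('knows').out('knows').dedup()
friends_of_friends = (Traversal(edges, ["alice"])
                      .out("knows").out("knows").dedup().to_list())
```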

GraphRAG: Knowledge Graphs + LLMs

Entity Extraction & Linking

Use LLMs to extract entities and relationships from unstructured text, then link them to existing knowledge graph nodes. Enables automatic knowledge graph construction and enrichment from documents.

Key Features

Named Entity Recognition (NER)
Relationship extraction
Entity disambiguation
Link to existing graph nodes
Incremental graph building

Similar Technologies

Manual curation, Rule-based extraction, spaCy NER

Graph-Augmented Retrieval

Combine vector similarity search with graph traversal. Find relevant documents via embeddings, then traverse the knowledge graph to find connected context. Richer context than pure vector RAG.

Key Features

Vector search for initial retrieval
Graph traversal for context expansion
Multi-hop relationship discovery
Structured + unstructured fusion
Better for complex queries

Similar Technologies

Pure Vector RAG, Keyword Search, Hybrid Search
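
A minimal sketch of the two-stage idea: rank documents by cosine similarity, then expand the top hit along graph edges. The embeddings, node names, and the retrieve function are made-up toy data; a real pipeline would use an embedding model and a graph store.

```python
import math

# Hypothetical toy data: document embeddings and a graph of linked nodes.
embeddings = {
    "doc_payments": [0.9, 0.1, 0.0],
    "doc_fraud": [0.7, 0.6, 0.1],
    "doc_hiring": [0.0, 0.2, 0.9],
}
graph = {
    "doc_fraud": ["entity_chargeback"],
    "entity_chargeback": ["doc_refund_policy"],
    "doc_payments": [],
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, top_k=1, hops=2):
    """Vector search for seed documents, then expand along graph edges."""
    ranked = sorted(embeddings,
                    key=lambda doc: cosine(query_vec, embeddings[doc]),
                    reverse=True)
    context = ranked[:top_k]
    frontier = list(context)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor not in context:
                    context.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return context


context = retrieve([0.6, 0.7, 0.1])
```

The refund-policy document is never close to the query in embedding space here; it is reached only through the graph, which is the context a pure vector search would miss.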

Community Detection

Identify clusters of related entities in the knowledge graph. Use community summaries for high-level context in RAG. Microsoft's GraphRAG uses this for hierarchical summarization.

Key Features

Leiden/Louvain clustering
Community summarization
Hierarchical abstraction
Global query answering
Theme identification

Similar Technologies

Document summarization, Topic modeling, Clustering
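
Louvain searches for a partition with high modularity, and Leiden can optimize the same score. The sketch below computes only that score for a given partition of an undirected, unweighted graph; it does not implement either search algorithm. The function name and sample graph are assumptions.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    community_of = {node: i
                    for i, group in enumerate(communities)
                    for node in group}
    internal = [0] * len(communities)   # edges with both ends in community i
    degree = [0] * len(communities)     # summed node degrees per community
    for u, v in edges:
        degree[community_of[u]] += 1
        degree[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))


# Two triangles joined by a single bridge edge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

split = modularity(edges, [{"a", "b", "c"}, {"d", "e", "f"}])
merged = modularity(edges, [{"a", "b", "c", "d", "e", "f"}])
```

Splitting at the bridge scores higher than keeping everything in one community, which is the signal a clustering algorithm follows.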

Knowledge Graph QA

Convert natural language questions to graph queries. LLM generates Cypher/SPARQL from user questions, executes against knowledge graph, and formats results. Precise answers from structured data.

Key Features

Natural language to Cypher/SPARQL
Schema-aware generation
Query validation
Result formatting
Explainable answers

Similar Technologies

Text2SQL, Semantic parsing, Neural QA
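
One way to approach the query-validation step is to check a generated query against the graph schema before running it. The sketch below uses crude regular expressions on a Cypher string; the schema, the validate_cypher helper, and the sample query are hypothetical, and a production system would more likely use a real parser.

```python
import re

# Hypothetical schema for illustration.
SCHEMA = {
    "labels": {"Person", "Company"},
    "relationships": {"WORKS_AT", "KNOWS"},
}


def validate_cypher(query, schema=SCHEMA):
    """List node labels and relationship types the schema does not define."""
    labels = re.findall(r"\(\s*\w*\s*:\s*(\w+)", query)      # (p:Person)
    rel_types = re.findall(r"\[\s*\w*\s*:\s*(\w+)", query)   # [r:KNOWS]
    problems = [f"unknown label: {name}" for name in labels
                if name not in schema["labels"]]
    problems += [f"unknown relationship: {name}" for name in rel_types
                 if name not in schema["relationships"]]
    return problems


generated = "MATCH (p:Person)-[:EMPLOYED_BY]->(c:Company) RETURN p.name"
issues = validate_cypher(generated)
```

A non-empty result can be fed back to the LLM as a correction hint instead of executing a query that would silently return nothing.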

Ontology & Schema Design

Standard	Description	Use Case	Complexity
RDF (Resource Description Framework)	W3C standard for representing data as subject-predicate-object triples	Semantic web, linked data, interoperability	Medium
OWL (Web Ontology Language)	Expressive ontology language built on RDF for complex reasoning	Formal reasoning, inference, domain modeling	High
RDFS (RDF Schema)	Lightweight schema vocabulary for RDF class/property hierarchies	Simple taxonomies, basic inference	Low
Property Graph Model	Nodes and edges with properties (key-value pairs)	Application data, flexible schemas	Low
Schema.org	Shared vocabulary for structured data on web pages	SEO, web data extraction, common entities	Low

Entity Resolution & Linking

Deduplication

Identify and merge duplicate entities that refer to the same real-world object. Use string similarity, embeddings, and rule-based matching. Critical for data quality in knowledge graphs.

Key Features

String similarity (Levenshtein, Jaro-Winkler)
Embedding-based matching
Blocking for scalability
Merge strategies
Conflict resolution

Similar Technologies

Manual review, Rule-based matching, ML classification
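
A sketch of string-similarity matching with blocking: names are grouped by first letter, and Levenshtein distance is computed only within each group. The 0.8 threshold, the first-letter blocking key, and the sample names are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def levenshtein(a, b):
    """Edit distance via the classic dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # deletion
                               current[j - 1] + 1,             # insertion
                               previous[j - 1] + (ca != cb)))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    return 1 - levenshtein(a, b) / max(len(a), len(b), 1)


def find_duplicates(names, threshold=0.8):
    """Block on the first letter, then compare pairs only within a block."""
    blocks = defaultdict(list)
    for name in names:
        blocks[name[:1].lower()].append(name)
    pairs = []
    for block in blocks.values():
        for a, b in combinations(block, 2):
            if similarity(a.lower(), b.lower()) >= threshold:
                pairs.append((a, b))
    return pairs


names = ["Acme Corp", "Acme Corp.", "Apex Ltd", "Zenith Inc"]
duplicates = find_duplicates(names)
```

Blocking trades recall for speed: two spellings that differ in their first letter would never be compared under this key.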

Entity Linking

Connect mentions in text to entities in a knowledge base (e.g., Wikipedia, Wikidata). Disambiguation based on context. Essential for building knowledge from unstructured sources.

Key Features

Candidate generation
Context-based disambiguation
Wikidata/DBpedia linking
NIL detection (new entities)
Cross-lingual linking

Similar Technologies

Named Entity Recognition, Coreference Resolution
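
The candidate-generation, disambiguation, and NIL steps can be sketched with a tiny in-memory knowledge base. The entity ids, keyword sets, and the link function are all hypothetical; real linkers score candidates with learned models instead of keyword overlap.

```python
# Hypothetical mini knowledge base: id -> (surface names, context keywords).
KB = {
    "kb:paris_city": ({"paris"}, {"france", "capital", "seine"}),
    "kb:paris_hilton": ({"paris", "paris hilton"}, {"hotel", "celebrity", "heiress"}),
}


def link(mention, context, kb=KB, min_overlap=1):
    """Return the best-matching KB id, or None (NIL) if no candidate fits."""
    words = set(context.lower().split())
    # Candidate generation: entities that list this surface form.
    candidates = [eid for eid, (names, _) in kb.items()
                  if mention.lower() in names]
    # Disambiguation: pick the candidate sharing the most context words.
    best_id, best_score = None, 0
    for eid in candidates:
        score = len(words & kb[eid][1])
        if score > best_score:
            best_id, best_score = eid, score
    return best_id if best_score >= min_overlap else None


city = link("Paris", "The capital of France lies on the Seine")
person = link("Paris", "The heiress opened a hotel")
unknown = link("Berlin", "The capital of Germany")
```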

Canonicalization

Establish canonical (preferred) forms for entities and relationships. Handle aliases, abbreviations, and alternative names. Enables consistent querying and data integration.

Key Features

Primary identifier selection
Alias management
Preferred label handling
Cross-reference maintenance
URI/IRI standards

Similar Technologies

Synonym lists, Thesauri, Normalization
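
A sketch of alias management: every known surface form is normalized and mapped to one canonical identifier with a preferred label. The identifiers, alias table, and helper names are illustrative assumptions.

```python
import unicodedata

# Hypothetical alias table: canonical id -> preferred label and known aliases.
ENTITIES = {
    "org:ibm": {"label": "IBM",
                "aliases": ["International Business Machines", "I.B.M."]},
    "org:un": {"label": "United Nations", "aliases": ["UN", "U.N."]},
}


def normalize(name):
    """Case-fold, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    return " ".join(text.casefold().split())


# Index every label and alias under its normalized form.
ALIAS_INDEX = {
    normalize(name): entity_id
    for entity_id, record in ENTITIES.items()
    for name in [record["label"], *record["aliases"]]
}


def canonicalize(name):
    """Map a known surface form to (canonical id, preferred label), else None."""
    entity_id = ALIAS_INDEX.get(normalize(name))
    if entity_id is None:
        return None
    return entity_id, ENTITIES[entity_id]["label"]


resolved = canonicalize("  i.b.m ")
```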

Knowledge Graph Tools

Microsoft GraphRAG

Microsoft's open-source implementation of graph-based RAG. Builds knowledge graphs from documents using LLMs, performs community detection, and enables both local and global queries.

Key Features

Automatic graph construction
Community detection & summarization
Local + global query modes
Hierarchical indexing
Open source (Python)

Similar Technologies

LlamaIndex KG, LangChain Graph, Custom implementations

LlamaIndex Knowledge Graph

LlamaIndex's knowledge graph index for RAG. Extracts triples from documents, stores in graph, and combines graph traversal with vector retrieval for enhanced context.

Key Features

Triple extraction from docs
Multiple graph store backends
Hybrid retrieval
Natural language querying
LlamaIndex integration

Similar Technologies

Microsoft GraphRAG, LangChain, Custom pipelines

Protégé

Stanford's open-source ontology editor. Visual interface for creating OWL ontologies. Widely used for ontology development, with reasoning and visualization capabilities.

Key Features

Visual ontology editing
OWL 2 support
Reasoner integration (HermiT, Pellet)
Plugin ecosystem
Collaborative editing

Similar Technologies

TopBraid, PoolParty, WebProtégé

spaCy + EntityLinker

NLP library with entity linking capabilities. Extract entities from text and link to knowledge bases. Foundation for building knowledge graph pipelines from unstructured data.

Key Features

Named Entity Recognition
Entity linking to Wikidata/custom KB
Relation extraction (via extensions)
Fast processing
Python ecosystem

Similar Technologies

Flair, Stanford NER, Hugging Face NER

RDFLib

Python library for working with RDF. Parse, serialize, and query RDF data. Build knowledge graphs programmatically with support for multiple serialization formats.

Key Features

RDF parsing/serialization
SPARQL queries
Multiple formats (Turtle, N-Triples, JSON-LD)
Graph operations
OWL-RL inference (via the companion owlrl package)

Similar Technologies

Apache Jena, Oxigraph, RDF4J

NetworkX

Python library for graph analysis. Not a database, but essential for graph algorithms, analysis, and visualization. Useful for prototyping and analyzing knowledge graph structure.

Key Features

Graph algorithms (centrality, paths)
Community detection
Visualization integration
In-memory processing
Scientific computing

Similar Technologies

igraph, graph-tool, SNAP