Data Structures
- Data Structure can be defined as the group of data elements which provides an efficient way of storing and organising data in the computer so that it can be used efficiently.
DATA TYPE
A data type is an attribute of data which tells the compiler (or interpreter) how the programmer intends to use the data.
- Primitive: basic building block (boolean, integer, float, char etc.)
- Composite: any data type (struct, array, string etc.) composed of primitives or composite types.
- Abstract: data type that is defined by its behaviour (tuple, set, stack, queue, graph etc).
Data Structures
- A data structure is a collection of data type ‘values’ which are stored and organized in such a way that it allows for efficient access and modification.
- In some cases a data structure can become the underlying implementation for a particular data type.
- For example, composite data types are data structures that are composed of primitive data types and/or other composite types.
When we think of data structures, there are generally four forms:
- Linear: arrays, lists
- Tree: binary, heaps, space partitioning etc.
- Hash: distributed hash table, hash tree etc.
- Graphs: decision, directed, acyclic etc.
Linear
- A data structure is a collection of data type ‘values’ which are stored and organized in such a way that it allows for efficient access and modification.
- In some cases a data structure can become the underlying implementation for a particular data type.
- For example, composite data types are data structures that are composed of primitive data types and/or other composite types.
When we think of data structures, there are generally four forms:
- Linear: arrays, lists
- Tree: binary, heaps, space partitioning etc.
- Hash: distributed hash table, hash tree etc.
- Graphs: decision, directed, acyclic etc.
Linear
Array
Linked List
- A linked list is different to an array in that the order of the elements within the list are not determined by a contiguous memory allocation.
- Instead the elements of the linked list can be sporadically placed in memory due to its design, which is that each element of the list consists of two parts:
- the data
- a pointer
The data is what you’ve assigned to that element/node, whereas the pointer is a memory address reference to the next node in the list.

Also unlike an array, there is no index access. So in order to locate a specific piece of data you’ll need to traverse the entire list until you find the data you’re looking for.
Tree
Tree
The concept of a ‘tree’ in its simplest terms is to represent a hierarchical tree structure, with a root value and subtrees of children (with a parent node), represented as a set of linked nodes.

The top level node is known as the “root” and a node with no children is a “leaf”. If a node is connected to other nodes, then the preceeding node is referred to as the “parent”, and nodes following it are “child” nodes.
There are various incarnations of the basic tree structure, each with their own unique characteristics and performance considerations:
- Binary Tree
- Binary Search Tree
- Red-Black Tree
- B-tree
- Weight-balanced Tree
- Heap
- Abstract Syntax Tree
Hash Table
A hash table is a data structure which is capable of maping ‘keys’ to ‘values’, and you’ll typically find this is abstracted and enhanced with additional behaviours by many high-level programming languages such that they behave like an ‘associative array’ abstract data type.
In Python it’s called a ‘dictionary’ and has the following structure (on top of which are functions such as del, get and pop etc that can manipulate the underlying data):
table = {'name': 'foobar', 'number': 123}
The keys for the hash table are determined by way of a hash function but implementors need to be mindful of hash ‘collisions’ which can occur if the hash function isn’t able to create a distinct or unique key for the table.
There are many techniques for resolving hashing collisions, but here are two that I’ve encountered:
- Separate Chaining
- Linear Probing
Graph
A graph is an abstract data type intended to guide the implementation of a data structure following the principles of graph theory.
The data struture itself is non-linear and it consists of:
- nodes: points on the graph (also known as ‘vertices’).
- edges: lines connecting each node.
The following image demonstrates a ‘directed’ graph (notice the edges have arrows indicating the direction and flow):
Graphs are used for representing networks (both real and electronic), such as streets on a map or friends on Facebook.
When it comes to searching a graph, there are two methods:
- Breadth First Search: look at siblings.
- Depth First Search: look at children
Conclusion
The discussion of data structures (and the various high-level data types) in computing is a massive topic, and so we cannot hope to try and cover every aspect and detail due to the sheer broad scope.
That said, I hope you found some useful insights and are now able to look at your programming language of choice with a slightly clearer understanding of the possible structures that sit beneath the abstraction surface.



Comments
Post a Comment