Skip to main content

Data Structures and Algorithms

Definition of Data Structure

A data structure is a way of representing data for efficient processing, retrieval, organization and storage of data and relationships between them.

It is a way of representing real-life objects and their relationships in computer memory. Apart from representation data structures are also used for the storage of data, for facilitating processing and for access and communication of information.

A data structure may strictly represent only information like in procedural languages or it may also contain methods for manipulating the contained data like in object-oriented languages, in such cases the structure is also called an object.

Data structures help to make the data more human-readable and manageable. Sometimes, a specific data structure may speed up an algorithm. We prefer to use data structures that speed up algorithms while occupying the least memory.

The following represents an example of a data structure used for representing a student in C programming language.

struct Student{
char* name;
int age;
int symbolNumber;
int rollNumber;
}

Data structures are formed by grouping primitive types like int, string, float, bool etc and even other complex data structures.

Importance of Data Structure

  1. For representing complex real-life objects: Data structures are a foundation for building complex applications and organizing data in programming. The primitive data types available in computers like int, float, string etc generally can't represent complex objects in real life. Data structures help represent real-life objects by grouping a bunch of primitive data types in a single unit.

  2. For efficient access: Data structures help speed up data access. For example, we can use the binary tree data structure for efficiently inserting and retrieving data in a sorted order. Similarly, data structures like B-Tree are used by databases for indexing which provides faster access to the data.

  3. For efficient storage: By using properly normalized data structures like in relational databases such as SQL, we can reduce the redundant information stored in the memory thus decreasing the amount of storage used.

  4. For efficient processing: Some algorithms require data to be represented in a specific way for faster execution. For example, performing a sorted insert operation in a binary tree is faster than performing a sorted insert operation in an array. If an algorithm requires lots of sorted insert operations, it is better to use a binary tree rather than an array.

  5. For standardization: Data structures help establish a standard way of representing information. This helps to make communication, retrieval and storage of data much easier when multiple different platforms are involved.

Properties of Data Structure

Data structures have various properties according to which they are classified:

  1. Primitive or Derived: Some data structures such as char, int, float, boolean etc are called primitive as they can't be further divided into subparts. Similarly, data structures like string, list, array, structs etc are derived by composing such primitive data structures and hence are called derived.

  2. Sequential or Non-Sequential: In a sequential data structure, data elements are arranged one after another. Examples are list, array, queue, stack etc. On the other hand, in a non-sequential data structure, data elements are not arranged one after another. Examples are graphs, trees, hashmaps etc.

  3. Homogenous or Heterogeneous: Homogenous data structures can only contain data of a single type, eg: arrays in C++. On the other hand, heterogeneous data structures can contain data of different types, eg: arrays in Python.

  4. Static or Dynamic: Static data structures have a predetermined size which is fixed at compile time. On the other hand, dynamic data structures can change their size during the runtime.

Types of Data Structure

Following is a short description of the various types of data structures, we will go into details of each of them in the coming chapters.

  1. Primitive Data Structures: Primitive data structures are basic building blocks for complex data structures.

    1. Character: It represents a single character of the alphabet. ASCII characters are of 8-bit length while unicode characters are generally 16-bit in length.
    2. Integer: Integers are numbers without decimals. They can be 8-bit, 16-bit, 32-bit, 64-bit or 128-bit in length. In languages like Python v3, integers can be of any size.
    3. Float: Float are numbers with decimals and exponents. They are used to represent very large or very small numbers in scientific notation.
    4. Boolean: Boolean represents two values of either true or false. They can be represented with a single bit but generally are represented with 8 bits due to the structure of memory.
  2. Derived Data Structures: Derived data structures are composed of one or more primitive data types or even other derived data types.

    1. Sequential Data Structures
      1. Array: In an array, data are arranged sequentially in a contiguous block of memory.
      2. List: In list data are arranged sequentially but not in a contiguous block of memory. The sequence is determined by references pointing to the next element.
      3. Stack: A Stack is similar to an array or a list but only provides insertion and retrieval from a single end.
      4. Queue: Queue is also similar to array or list but it provides insertion from one end and retrieval from the other end.
    2. Non-Sequential Data Structures
      1. Tree: A tree is a data structure in which elements can have relations to one or more than one other element in a single direction. The relations can also have extra properties.
      2. Graph: A graph is a data structure in which elements can have relations to any other elements or itself in any direction. The relations can also have extra properties. A tree is a special case of graph.
      3. Hash map: A Hash map is a data structure in which the storage location of elements is determined via a special function known as the hashing function.