Structure

Struct and its usage in C

Written by: sami1907115

Wed, 20 Nov 2024

What is a Structure?

A structure in C is a user-defined data type that allows grouping of different data types under one name. Think of it as a container for different variables that belong together logically. It’s like creating a blueprint for an object, but instead of just one type of data, a structure can hold many kinds, such as integers, floats, and characters.

Loops, conditions, and basic data types are used extensively in competitive programming (CP). Have you ever used a struct in CP?
For many people, structs are a first glimpse of software development, where they show up far more often than in CP.

Why Do We Need Structures?

Let’s say you want to store information about students of a school consisting of their name, age, and GPA.

How can you solve this? Don’t think outside the box, think inside (for now). Think with your current knowledge.

The obvious answer involves arrays.
But arrays have a single type. Then what?
Multiple arrays of each type?
Messy? Hard to manage?

Instead of creating separate arrays for each attribute (which can get messy if you have hundreds of students), you can group these attributes into a structure.

Tired of int, bool, float?

No real connection to real-world problems?
Need something that doesn’t only work with numbers, logic, or gibberish but can represent real-world objects?

Say no more! Struct is your solution. (That’s why it’s the starting block of development).

Real-life Example

Imagine you’re designing a video game. You want to represent a player with various attributes like health, experience points, and player name. A structure is perfect for this because it lets you group these attributes logically.

struct Player {
    char name[30];
    int health;
    int experience;
};

Working with Structures

Like functions, structures go through three stages:

Declaration
Initialization
Access

Declaration

Basic Way:

struct Student {
    char name[50];
    int age;
    float GPA;
};

With variables:

struct Person { 
    char name[50]; 
    int age; 
} person1, person2;

With typedef:

typedef struct {
    char name[50];
    int age;
} Employee;

Why typedef?

In main:

Without typedef, every variable declaration needs the struct keyword:
```
struct Student s1;
struct Student s2;
```
With typedef the keyword can be dropped. This assumes Student was declared as `typedef struct { ... } Student;`:
```
Student s1, s2;
```

Initialization

Consider the following example:

typedef struct Person {
    char name[50];
    int age;
    float height;
} Person;

Person p;

We can initialize a struct object:

By assigning each member

strcpy(p.name, "John Doe");  // arrays cannot be assigned with =; strcpy needs <string.h>
p.age = 30;
p.height = 5.9f;

Direct initialization

struct Person p1 = {"John Doe", 30, 5.9};

With designated initializer

struct Person p2 = {.age = 25, .height = 6.1, .name = "Alice"};

With loop (assuming an array declared as Person person[3];):

for (int i = 0; i < 3; i++) {
    printf("Enter details for person %d:\n", i + 1);
    printf("Name: ");
    scanf("%49s", person[i].name);   // width limit keeps the input inside name[50]
    printf("Age: ");
    scanf("%d", &person[i].age);
    printf("Height: ");
    scanf("%f", &person[i].height);
}

Access

In C, there are two main operators for accessing members of a struct: the dot (.) operator and the arrow (->) operator.

Dot (`.`) Operator

The dot operator is used to access members of a structure directly when you have a structure variable.

#include <stdio.h>

struct Person {
    char name[50];
    int age;
    float height;
};

int main() {
    struct Person p1 = {"Alice", 30, 5.9};

    // Access members using the dot operator
    printf("Name: %s\n", p1.name);       // Outputs: Alice
    printf("Age: %d\n", p1.age);         // Outputs: 30
    printf("Height: %.1f\n", p1.height); // Outputs: 5.9

    return 0;
}

Arrow (`->`) Operator

The arrow operator is used when you have a pointer to a structure.

#include <stdio.h>

struct Person {
    char name[50];
    int age;
    float height;
};

int main() {
    struct Person p1 = {"Bob", 25, 6.1};
    struct Person *ptr = &p1;

    // Access members using the arrow operator
    printf("Name: %s\n", ptr->name);       // Outputs: Bob
    printf("Age: %d\n", ptr->age);         // Outputs: 25
    printf("Height: %.1f\n", ptr->height); // Outputs: 6.1

    return 0;
}

Alternative:

(*ptr).age  // Dereference the pointer and use dot operator

Structures and Arrays

Arrays of structures allow creating collections of related but varied data.

Declaration:

struct StructName arrayName[arraySize];  // drop 'struct' if StructName is a typedef

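Access (this example uses a Student with name, rollNumber, and marks members):

struct Student {
    char name[50];
    int rollNumber;
    float marks;
};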

struct Student students[3] = {
    {"John", 101, 87.5},
    {"Alice", 102, 92.0},
    {"Bob", 103, 76.8}
};

printf("Name: %s\n", students[0].name);        // Outputs "John"
printf("Roll Number: %d\n", students[1].rollNumber); // Outputs "102"

// Modifying data
students[2].marks = 80.5;   // Changing Bob's marks

Common Mistakes (and how to avoid them):

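Forgetting the struct keyword:

Without a typedef, the type is called `struct Employee`, so writing `Employee emp1;` on its own does not compile in C. A typedef removes the need for the keyword: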

typedef struct {
    int id;
    char name[30];
} Employee;

Employee emp1;  // No need for 'struct'

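Incorrect Member Access:

struct Student *ptr = &s1;
printf("%s", ptr->name);     // Correct: -> goes with a pointer
// printf("%s", ptr.name);   // Wrong: ptr is a pointer, so '.' does not compile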

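Not Initializing Structure Members:
Always initialize your structure variables. Local (automatic) variables that are left uninitialized hold indeterminate (garbage) values; only global and static variables start out as zero.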

Structures Inside Structures

A structure can contain other structures as members, which is handy for modeling composite data such as an employee record with a joining date:

struct Date {
    int day;
    int month;
    int year;
};

struct Employee {
    char name[50];
    struct Date joiningDate;
};

Other Structure-Related Concepts:

Unions: Similar to structures but all members share the same memory space. They are used when you want a variable to store different types of data but only one type at a time.

union Data {
    int i;
    float f;
    char str[20];
};

Enums: Another user-defined data type like structures, but it’s used to define sets of named integer constants.

enum Color {RED, GREEN, BLUE};

Structure Padding and Packing in C:

In C programming, when dealing with structs, the way memory is allocated to the structure's members may introduce padding. This padding is automatically added by the compiler to align data in memory for efficient access.

How does a processor access data in memory? The details vary from machine to machine, but a CPU typically fetches memory in fixed-size chunks, for example 4 bytes at a time, starting at addresses that are multiples of that size. A 4-byte int stored at an address that is a multiple of 4 can be read in a single fetch. The same int starting at an odd address would straddle two chunks and need two fetches, and some CPUs do not allow such an access at all. To avoid this, every type has an alignment requirement, which for basic types is usually equal to its size: a char can sit at any address, a 2-byte short at multiples of 2, a 4-byte int at multiples of 4.

A char still occupies only 1 byte. What changes is that the compiler may leave unused bytes after it so that the next member starts on a suitable boundary. Padding refers to these extra bytes that the compiler inserts between structure members, and at the end of the structure, to keep every member properly aligned.

This process of adding padding, however, may lead to unused memory space. Understanding how padding works and how to minimize it is essential for optimizing memory usage.

Why Padding is Used:

Alignment: Certain processors access data more efficiently when data is aligned on memory boundaries that match the processor’s word size. For example, on a 32-bit system, accessing a 4-byte integer that is aligned to a 4-byte boundary is faster than if it’s misaligned.

Efficiency: Memory alignment can reduce the number of memory fetches required and can prevent performance penalties.

Example of Padding:

Consider the following structure:

struct Example {
    char a;     // 1 byte
    int b;      // 4 bytes
    char c;     // 1 byte
};

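In memory, without padding, the structure would require 1 + 4 + 1 = 6 bytes. However, on a typical platform where int is 4 bytes, the compiler inserts padding so that int b starts on a 4-byte boundary. It also adds trailing padding so that the structure size is a multiple of the strictest alignment among its members (here, that of int), which keeps every element aligned when these structures are placed in an array. The memory layout would look like this: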

Member	Size (bytes)	Memory Offset
a	1	0
padding	3	1-3
b	4	4-7
c	1	8
padding	3	9-11

Thus, the structure will occupy 12 bytes instead of the expected 6 bytes.

Minimizing Padding (Improving Memory Efficiency):

There are several ways to minimize padding and improve memory efficiency in C structures:

1. Reordering Structure Members:

By carefully arranging the structure members from largest to smallest in terms of their size, padding can be minimized.

For example:

   struct OptimizedExample {
       int b;    // 4 bytes
       char a;   // 1 byte
       char c;   // 1 byte
   };

In this layout, a and c sit right after b at offsets 4 and 5, so only 2 bytes of trailing padding are added after c. The total size of the structure is 8 bytes, saving 4 bytes compared to the original structure.

2. Using #pragma pack Directive:

The #pragma pack directive, a compiler extension supported by GCC, Clang, and MSVC, can be used to control the alignment requirements of structure members. This can eliminate padding altogether by forcing the compiler to pack the structure tightly.

Example with #pragma pack:

   #pragma pack(1)  // No padding between members
   struct PackedExample {
       char a;   // 1 byte
       int b;    // 4 bytes
       char c;   // 1 byte
   };
   #pragma pack()   // Reset to default packing

In this case, the total size of the structure will be 6 bytes because no padding is added between members. However, keep in mind that this may introduce performance penalties on architectures where misaligned data access is slower, and some architectures do not support misaligned access directly at all.

3. Bit Fields:

Bit fields can be used to pack data more tightly, especially for small data members that don’t require a full byte of storage.

Example:

   struct BitFieldExample {
       unsigned int a : 4;   // 4 bits
       unsigned int b : 4;   // 4 bits
   };

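Here a and b share a single storage unit and together use just 8 bits of it. The size of the structure itself is implementation-defined, though: because the fields are declared as unsigned int, many compilers (GCC and Clang among them) still make the structure as large as one unsigned int, typically 4 bytes. Declaring the fields with a smaller type such as unsigned char, which most compilers accept as an extension, is a common way to get a 1-byte structure.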

Structure Packing:

Packing refers to organizing the data in memory as tightly as possible to minimize the memory footprint of a structure. While packing reduces memory usage by eliminating padding, it may result in performance penalties on some architectures due to inefficient data access (misaligned data). Packing should be used when memory is a critical resource, and performance penalties can be tolerated.

When to Use Packing:

- Embedded systems or memory-constrained environments, where saving every byte is crucial.

- When structures are sent over a network or saved to disk, and compactness is important.

Trade-offs of Packing:

Performance: Accessing misaligned data can slow down performance on some systems.

Portability: Code that uses packed structures may not be portable across architectures with different alignment requirements.

Structures in C are like the humble “lego blocks” of programming. Individually, they may not seem impressive, but when combined, they can build anything from simple programs to entire software systems. And just like lego, they’re a lot of fun once you start using them!