Primitive and Composite Structures

From

(Difference between revisions)
Jump to: navigation, search
(Records)
Line 42: Line 42:
<br>
<br>
-
<br>[[CS2|Back to Data Structures (CS2)]]
+
----
 +
{{CS2/ChapNav}}
 +
----
 +
[[Category:CS2|{{SUBPAGENAME}}]]

Revision as of 16:58, 9 April 2009

All languages start with primitive elements such as characters, floating point and fixed point numbers. While these hold data and require memory to store, they are not normally considered data structures.

Because many programming languages provide an array structure and a record structure, these are often considered primitive structures also. So, we begin with a short look at both an array and a record.

Arrays

What is an array? An array is a homogeneous collection of data that is stored sequentially in memory. An array has fast random access because of this: Since the data is homogeneous, I know the size, in bytes, of each entry in the array. Since the storage is sequential, if I know the starting address of the array, it is easy to calculate the address of any slot. Most languages provide an array as a basic data type. Most languages allow for multidimensional arrays. But an issue is whether they are truly multidimensional. That is, are “ragged” arrays allowed?


Address Calculation Example: Given A[ 4, 3 ], which has 4 rows and 3 columns. Suppose this is an array of ints, so each slot in the array takes 4 bytes. If the array is stored in column major order beginning at address 100, then the memory layout is the following.

(..need diagram..)

100 104 108 112 116 120 124 128 132 136 140 144

[0,0] [1,0] [2,0] [3,0] [0,1] [1,1] [2,1] [3,1] [0,2] [1,2] [2,2] [3,2]


To access location [2, 1] requires moving 1 column plus 2 more slots. Since a column is 4 slots or 16 bytes, then the address of [2, 1] is

100 + 1 column + 2 slots or
100 bytes + 4 slots * 4 bytes per slot + 2 slots * 4 bytes per slot or
100 bytes + 16 bytes + 8 bytes = 124 bytes. 

However, to access the entire array we would typically write

 for( int i = 0; i < 4; i ++ )  
    for( int j = 0; j < 3, j++ ) 


This gives the sequence: [0,0], [0,1], [0,2], [1,0], [1,1], [1,2], [2,1], [2,2], [3,0], [3,1], [3,2].


Is this a problem? Here is where the theoretical meets the practical. Consider a large array and paging.


Sparse arrays: A sparse array is one where more than half of it is empty. In this case is the array the best choice for the storage of the data?


Records

A record is a heterogeneous collection of data. The components of a record structure are usually referred to as fields. Many languages provide the ability to construct records. For example, a Java class that has no methods or has only accessor and mutator methods is a form of a record structure. In this case, the class instance variables would be the fields of the record.


Variant Records: A variant record is one where a single field can represent multiple data types, but only one type at a time. If you know what an overlay refers to in an operating system, a variant record is the same idea. That is, a variant record could have one field that might represent an int, or a double, or a boolean. It would have enough memory allocated to it to represent the biggest element, in this case a double, but only one component could be represented at a time.



CS2: Data Structures
Theory of Computation - ADT Preliminaries
Linear ADTs - Tree ADTs - Graph ADTs - Unordered Collection ADTs


Personal tools
MediaWiki Appliance - Powered by TurnKey Linux