Memory Research


Overview:

For over two decades, magnetic cores were used to implement computer memories. But as large-scale semiconductor technology evolved, magnetic core memories were replaced by semiconductor memories. Semiconductor memories are cheaper, more efficient, and much smaller than magnetic core memories.

Memories can be classified according to the accessing scheme (random access versus serial, or "sequential", access). They can also be classified as volatile or nonvolatile.

Semiconductor memories are implemented with bipolar or MOS transistors, and the data can be stored either as charge on capacitors or as the states of flip-flops.

Memory devices have a number of address inputs, data inputs, data outputs, and control inputs that enable the chip or put it in read or write mode.

The number of data outputs determines the size of the memory cell in bits, while the address inputs determine which memory cell is active (enabled). The size of a memory unit is usually given as the number of words times the number of bits per word.
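
As a small illustration of this sizing rule (the 10-address-line, 8-data-line chip below is only an assumed example), this C sketch computes the capacity of a 1K x 8 memory:

    #include <stdio.h>

    int main(void) {
        int address_lines = 10;            /* assumed example: 10 address pins */
        int data_lines    = 8;             /* assumed example: 8 data pins     */
        long words = 1L << address_lines;  /* 2^10 = 1024 addressable words    */
        long bits  = words * data_lines;   /* total capacity in bits           */
        printf("%ld words x %d bits = %ld bits\n", words, data_lines, bits);
        return 0;
    }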
 

Memory types according to storage type:

As mentioned before, there are two types of memories: volatile and nonvolatile.

The contents of a volatile memory are lost when the power is turned off, while the contents of a nonvolatile memory are retained. ROM (Read Only Memory) and disk storage are nonvolatile; RAM (Random Access Memory) is volatile. Both kinds have their roles in a computer: nonvolatile memories hold the programs needed to boot the computer, while volatile memory, being faster, holds the programs used while the computer is running; important data are then transferred to the disk to keep them safe.
 
 
 

Memory types according to the access scheme:

The memory unit, as described earlier, has address input lines; these inputs decide which cell should be activated (enabled). There are two types of memories that manage this addressing technique:

The first is random access memory (such as RAM and ROM).

The second is serial or sequential memory (such as tape storage).

A third type is called associative memory, or content addressable memory (CAM). This type is addressed by its contents, so it can search for data directly.
 

Associative Memory (CAM):

This type of memory is used for applications that need to search for items in a table stored in memory, such as database applications and communications systems.
There are no addresses in this type of memory; instead, the data are stored in sequence, and a search is performed to find and retrieve them. As a result, data are identified not by address but by content (hence the name). When a word is to be written to the memory, no address is supplied, because the memory itself determines a free cell in which to store the word.
During a read operation, the memory searches for a word, or part of a word, in parallel across all locations.
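
A minimal software sketch of this idea follows (real CAMs perform the comparison in hardware, over all words at once; the loop and the mask scheme here are only an illustrative model):

    #include <stdio.h>

    #define WORDS 8

    /* Software model of a CAM lookup: compare the key against every stored
       word under a mask (1-bits in the mask take part in the comparison). */
    int cam_search(const unsigned words[], unsigned key, unsigned mask) {
        for (int i = 0; i < WORDS; i++)        /* hardware does this in parallel */
            if ((words[i] & mask) == (key & mask))
                return i;                      /* index of the matching word */
        return -1;                             /* no match found */
    }

    int main(void) {
        unsigned mem[WORDS] = {0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0};
        printf("match at %d\n", cam_search(mem, 0x56, 0xFF));  /* prints 2 */
        return 0;
    }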

RAM:

In a random access memory, the data can be accessed by specifying an address, and the time required to read the contents of a location is independent of that address. In contrast, data can only be read out of a sequential memory in the same order in which they were originally written. Because the access time varies with the location of the data, sequential memories are not used as main memories in computers; they serve instead as buffers or as storage units, such as disk storage units.
 

Memory types from the computer point of view:

Main memory:

It is the memory unit that communicates directly with the CPU. In general it is a static or dynamic RAM. A portion of this memory may be ROM, so that the computer can store important information about the configuration of the system and the start-up procedures.

Auxiliary memory:

It is the set of devices that provide backup storage. Hard disks, magnetic tapes, and CD-ROMs are examples of auxiliary memory.

Cache Memory:

It is a high-speed memory used to speed up CPU access to the main memory.

Virtual Memory:

It is a concept that permits programs to use an address space larger than the existing main memory by using part or all of the auxiliary memory. The size of the total memory that can be used depends on the address space. Each address that points to a memory cell held in auxiliary memory must be mapped into a physical address; this translation from what we call the virtual address to the physical address works as follows:

There are many mapping algorithms. One algorithm stores the virtual-to-physical correspondences in a lookup table and uses it to perform the mapping. Another divides the main and virtual memories into pages, so that whole pages are transferred between them.
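
A minimal sketch of the page-based scheme, assuming a hypothetical 4 KB page size and a simple flat page table (real systems use multi-level tables and must also bring the missing page in from auxiliary memory on a fault):

    #include <stdio.h>

    #define PAGE_SIZE 4096        /* assumed page size: 4 KB             */
    #define NUM_PAGES 16          /* tiny example virtual address space  */

    /* page_table[v] holds the physical frame for virtual page v,
       or -1 if the page currently lives in auxiliary memory.     */
    int page_table[NUM_PAGES] = {3, 7, -1, 1};

    long translate(long vaddr) {
        long page   = vaddr / PAGE_SIZE;  /* which virtual page          */
        long offset = vaddr % PAGE_SIZE;  /* position inside the page    */
        int  frame  = page_table[page];
        if (frame < 0)
            return -1;                    /* page fault: fetch from disk */
        return (long)frame * PAGE_SIZE + offset;
    }

    int main(void) {
        /* virtual address 4100 = page 1, offset 4 -> frame 7 */
        printf("vaddr 4100 -> paddr %ld\n", translate(4100));
        return 0;
    }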

Cache memory:

Cache memory is defined as a dedicated bank of high-speed memory holding recently accessed data, designed to speed up subsequent access to the same data. The technique is based on the observation that programs frequently go back and re-execute the same instructions. When data is read from main system memory, a copy is also saved in the cache, along with the associated main memory address. The cache then monitors subsequent requests to see whether the information needed is already stored in the cache. If the data is indeed in the cache, it is delivered immediately to the processor while the attempt to fetch it from main memory is aborted (or not started). If, on the other hand, the data is not in the cache, it is fetched directly from main memory and also saved (allocated) in the cache for future access. This pays off because main memory (DRAM) is much slower than cache memory (SRAM).

The efficiency of the cache memory is measured by a quantity called the hit ratio. When the CPU requests a word from memory and the word is found in the cache, a hit is said to occur; if the word is not found in the cache, a miss occurs. The ratio of hits to total memory accesses (hits plus misses) defines the hit ratio.
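
The bookkeeping is simple enough to show directly (the counts below are invented for illustration):

    #include <stdio.h>

    int main(void) {
        long hits   = 960;    /* example counts, not measured data */
        long misses = 40;
        double hit_ratio = (double)hits / (double)(hits + misses);
        printf("hit ratio = %.2f\n", hit_ratio);   /* prints 0.96 */
        return 0;
    }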

To conclude, cache memory is a small amount of very fast (zero wait state) memory that sits between the CPU and main memory.

Figure 1 Cache memory location.

Cache mapping techniques:

The transformation between a CPU data request and the data the cache sends back is called the mapping process. One commonly used mapping is associative mapping, which uses associative memory: the cache stores both the data and its address. When the CPU requests a word from the cache by its address, the cache searches for that address and returns the corresponding data field if it is found; otherwise the data is requested from main memory.

Another mapping technique is called direct mapping. Here the cache is built from random access memory instead of CAM. The CPU address is divided into two parts, an index and a tag. The address space of the cache equals the index size, while the address bus of the main memory is as wide as the index and the tag together.
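
A minimal sketch of a direct-mapped lookup, assuming a hypothetical 8-bit index with the tag in the remaining high address bits (the sizes are chosen only for illustration):

    #include <stdio.h>

    #define INDEX_BITS 8                   /* assumed: a 256-line cache */
    #define NUM_LINES  (1 << INDEX_BITS)

    struct line {
        int      valid;                    /* does the line hold real data? */
        unsigned tag;                      /* high bits of the address      */
        unsigned data;                     /* the cached word               */
    };

    struct line cache[NUM_LINES];

    /* Returns 1 on a hit (word copied to *out), 0 on a miss. */
    int lookup(unsigned addr, unsigned *out) {
        unsigned index = addr & (NUM_LINES - 1);  /* low bits pick a line */
        unsigned tag   = addr >> INDEX_BITS;      /* high bits must match */
        if (cache[index].valid && cache[index].tag == tag) {
            *out = cache[index].data;
            return 1;
        }
        return 0;                                 /* go to main memory    */
    }

    int main(void) {
        cache[0x34].valid = 1;             /* pretend 0x1234 was cached */
        cache[0x34].tag   = 0x12;
        cache[0x34].data  = 42;
        unsigned w = 0;
        printf("hit=%d data=%u\n", lookup(0x1234, &w), w);
        return 0;
    }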

Replacement algorithms:

There are two main replacement algorithms for replacing data currently in the cache with new data from main memory. The first is FIFO, which replaces the oldest data in the cache with the new data. The other is least recently used (LRU), in which the least recently used data is replaced by the new data.
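
A minimal sketch of the LRU policy over a small fully associative cache, using an age counter per slot (an illustrative scheme only; real hardware usually approximates LRU more cheaply):

    #include <stdio.h>

    #define SLOTS 4

    unsigned tags[SLOTS];    /* which block each slot holds       */
    unsigned age[SLOTS];     /* accesses since the slot was used  */
    int      used[SLOTS];    /* is the slot occupied?             */

    /* Touch a block: on a miss, evict the least recently used slot. */
    void access_block(unsigned tag) {
        for (int i = 0; i < SLOTS; i++) age[i]++;   /* every slot ages   */
        for (int i = 0; i < SLOTS; i++)             /* scan for a hit    */
            if (used[i] && tags[i] == tag) { age[i] = 0; return; }
        int victim = 0;                             /* miss: pick a slot */
        for (int i = 0; i < SLOTS; i++) {
            if (!used[i]) { victim = i; break; }    /* free slot first   */
            if (age[i] > age[victim]) victim = i;   /* else the oldest   */
        }
        used[victim] = 1;
        tags[victim] = tag;
        age[victim]  = 0;
    }

    int main(void) {
        unsigned trace[] = {1, 2, 3, 4, 1, 5};  /* 5 evicts block 2 (the LRU) */
        for (int i = 0; i < 6; i++) access_block(trace[i]);
        for (int i = 0; i < SLOTS; i++) printf("slot %d: %u\n", i, tags[i]);
        return 0;
    }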

Writing into cache (updating main memory):

Write-through cache: Data is written to main memory at the same time it is updated in the cache. This method ensures that DMA devices reading main memory receive the updated data.

Write-back cache: Data is only written to main memory when it is "forced out" of the cache (data is forced out of the cache if the space is needed to store other, more relevant pieces of information).

Write-through cache is simpler than write-back because data that needs to be forced out of cache can simply be overwritten (since we know that it has already been copied to main memory). In contrast, with write-back cache, a similar situation where data needs to be forced out requires the cache to initiate a main memory write of the "flushed" data. However, write-back is more efficient than write-through because data may be written many times in the cache without requiring a main memory access.
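
A minimal sketch contrasting the two policies (mem_write() is a stand-in for the actual bus traffic, not a real API):

    #include <stdio.h>

    struct cache_line {
        unsigned addr, data;
        int dirty;                    /* modified since last memory write? */
    };

    void mem_write(unsigned addr, unsigned data) {  /* stand-in for a bus write */
        printf("main memory <- [%#x] = %u\n", addr, data);
    }

    /* Write-through: the cache and main memory are updated together. */
    void write_through(struct cache_line *l, unsigned data) {
        l->data = data;
        mem_write(l->addr, data);     /* memory is always up to date */
    }

    /* Write-back: only the cache is updated; memory is written on eviction. */
    void write_back(struct cache_line *l, unsigned data) {
        l->data  = data;
        l->dirty = 1;                 /* remember the pending write */
    }

    void evict(struct cache_line *l) {
        if (l->dirty)                 /* flush only if modified */
            mem_write(l->addr, l->data);
        l->dirty = 0;
    }

    int main(void) {
        struct cache_line l = {0x100, 0, 0};
        write_through(&l, 1);         /* one memory write per store      */
        write_back(&l, 2);            /* no memory traffic yet           */
        write_back(&l, 3);
        evict(&l);                    /* single write of the final value */
        return 0;
    }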

CPU caches:

Level 1 vs. Level 2 Cache

Today's PCs come with two levels of cache. A Level 1 cache (L1 cache) is an internal cache built into the CPU chip itself (1 to 32 Kbytes). This cache is the fastest because it can be accessed directly by the internal processing components of the CPU. A Level 2 cache (L2 cache, or secondary cache), on the other hand, is an external cache of SRAM chips plugged into the motherboard (128 Kbytes to 1 Mbyte).

Asynchronous vs. Synchronous Cache

Two cache architectures differ from each other based on whether or not they can achieve perfect synchronization with the clock frequency of the CPU:
Asynchronous cache, as the name implies, operates at clock speeds slower than (not synchronous with) the CPU, which results in increased processor wait states. Synchronous cache can furnish data as fast as the CPU asks for it because both the cache and the CPU operate at the same clock speed. As a result, the cycle time for synchronous SRAM is up to 60% faster than that of asynchronous SRAM.

Synchronous pipelined burst cache is a data output architecture that prepares the next chunk of data while the previous chunk is being sent, so that no time is wasted; this technique is called pipelining. Flow-through architecture, on the other hand, allows data to be sent out of the cache in a steady, continuous stream. Flow-through cache provides slightly faster system performance, but the tighter timing required makes it more difficult to implement than pipelined burst.

The cache concept is used in other places in the computer, such as disk caches. Disk caches store data recently read from or written to a disk, to speed up writes and subsequent reads. Most implementations use RAM to speed up disk accesses. The disk cache memory may be on the disk controller, part of the disk drive, or in main processor memory (with the caching done by the disk driver software or a TSR).

One caching algorithm used in disk caches is look-ahead caching (or buffering), which reads disk sectors ahead of those requested, on the assumption that they will soon be requested. Segmented look-ahead can hold several such look-aheads. This is important for multitasking computers, which may interrupt a long sequential disk read to service another task.
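
A minimal sketch of the look-ahead idea (disk_read_sector() is a made-up stand-in for the real device access, and the buffer sizes are only examples):

    #include <stdio.h>
    #include <string.h>

    #define SECTOR_SIZE 512
    #define LOOKAHEAD   4          /* sectors fetched beyond the request */

    char buffer[LOOKAHEAD + 1][SECTOR_SIZE];
    long buffered_first = -1;      /* first sector currently buffered */

    /* Stand-in for the real device read. */
    void disk_read_sector(long sector, char *dst) {
        memset(dst, (int)(sector & 0xFF), SECTOR_SIZE);
    }

    /* Read one sector; on a buffer miss, also prefetch the next LOOKAHEAD. */
    void cached_read(long sector, char *dst) {
        if (buffered_first < 0 || sector < buffered_first ||
            sector > buffered_first + LOOKAHEAD) {
            for (int i = 0; i <= LOOKAHEAD; i++)       /* prefetch a run */
                disk_read_sector(sector + i, buffer[i]);
            buffered_first = sector;
        }
        memcpy(dst, buffer[sector - buffered_first], SECTOR_SIZE);
    }

    int main(void) {
        char sec[SECTOR_SIZE];
        cached_read(100, sec);   /* reads the "disk", prefetches 101-104 */
        cached_read(101, sec);   /* served from the look-ahead buffer    */
        printf("first byte of sector 101: %d\n", sec[0]);
        return 0;
    }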
 

Memory Interface:

Most memory chips have control pins, address pins, and data pins. The address pins define the size of the memory in words (memory size = 2^k, where k is the number of bits on the address bus), while the data pins define the size of the memory word. One of the control pins is called Chip Select; the microprocessor uses it to enable the chip for reading data from or writing data to the memory. Memories that are not read-only have two further pins, read and write.


Figure 2 Block diagram of memory unit

Memory chips are composed of an array of small storage cells, each storing one bit of data. A set of these cells forms a single word, so in order to read or write a word, all of its cells must be activated together (selected by the address bus signals) to place their contents on the data bus. The set of words defines the total memory size.


Figure 3 Memory Cell block diagram

One word must be selected at a time. Decoders are used to reduce the total number of address pins on the memory chip (without them, there would have to be a pin for each word in the memory).

Figure 4 Logical Construction of a memory

CPU Memory read and write operations:

When the CPU needs to read a word from memory, it places the address on the address bus, asserts the read line (since the CPU is reading data from memory), and then reads the resulting data from the data bus:


Figure 5 Memory Read operation

On the other hand, when the CPU needs to write a word to memory, it places the value on the data bus, the address on the address bus, and asserts the write line (since the CPU is writing data to memory):


Figure 6 Memory write Operation

These operations are done on a single word of some size, say 8 bits as in the 8088. But what about reading a 16-bit word? This problem is solved in different ways depending on the processor. The 80x86 family deals with it by storing the L.O. byte of the word at the address specified and the H.O. byte at the next location. Therefore, a word consumes two consecutive memory addresses and two memory accesses.

Another way is used in the 8086, 80186, 80286, and 80386sx processors, which have a 16-bit data bus. These processors organize memory into two banks, an "even" bank and an "odd" bank, storing the L.O. byte in the even bank and the H.O. byte in the odd bank. This introduces another problem: 16-bit words must be stored as described and cannot take an arbitrary address. If a 16-bit word is stored at an odd address, the CPU must make two memory accesses to get it: one to read the L.O. byte from the given address (odd bank), then another to read the H.O. byte from the next address (even bank).
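
A small sketch of the byte ordering just described (the 80x86 convention of placing the L.O. byte at the lower address):

    #include <stdio.h>

    int main(void) {
        unsigned char mem[2];
        unsigned short word = 0x1234;   /* H.O. byte 0x12, L.O. byte 0x34 */

        mem[0] = word & 0xFF;           /* L.O. byte at the address given */
        mem[1] = (word >> 8) & 0xFF;    /* H.O. byte at the next address  */

        printf("mem[0]=%#x mem[1]=%#x\n", mem[0], mem[1]);  /* 0x34 0x12 */
        return 0;
    }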

Although the size of a single memory chip is limited, the total memory can be expanded by connecting chips in series (i.e. all connected to the same address lines and data lines). The address space is then increased by using an external decoder to select a single chip at a time (a small sketch of this chip-select decoding follows Figure 7).


Figure 7 Block diagram shows how to increase the memory size
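
As a small illustration of that chip-select decoding, assume four 16 Kbyte chips forming a 64 Kbyte space (sizes invented for this example): the two high address bits drive a 2-to-4 decoder that asserts one chip select, while the low bits go to every chip.

    #include <stdio.h>

    #define CHIP_BITS 14               /* 2^14 = 16 Kbytes per chip (assumed) */
    #define NUM_CHIPS 4

    /* 2-to-4 decoder: the two high address bits assert one chip select. */
    void decode(unsigned addr, int cs[NUM_CHIPS], unsigned *chip_addr) {
        unsigned chip = addr >> CHIP_BITS;            /* high bits: which chip */
        *chip_addr = addr & ((1u << CHIP_BITS) - 1);  /* low bits: within chip */
        for (int i = 0; i < NUM_CHIPS; i++)
            cs[i] = (i == (int)chip);                 /* only one CS active    */
    }

    int main(void) {
        int cs[NUM_CHIPS];
        unsigned within;
        decode(0x9ABC, cs, &within);   /* 0x9ABC: chip 2, offset 0x1ABC */
        for (int i = 0; i < NUM_CHIPS; i++)
            printf("CS%d=%d ", i, cs[i]);
        printf("offset=%#x\n", within);
        return 0;
    }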

The size of the memory word can be expanded by connecting the chips in parallel (i.e. connecting them to the same address lines but to different data lines).

Figure 8 Block diagram show how to increase the memory word size

RAM Types:

There are mainly two types of RAM: dynamic RAM (DRAM) and static RAM (SRAM).
When people talk about memory in a computer, they are usually talking about DRAM.

SRAM:

This configuration of RAM is called static because the state of the bit remains at one level or the other until it is deliberately changed or power is removed.

The SRAM cell is implemented using six transistors as shown below.


Figure 9 SRAM memory cell

These transistors form a flip-flop: the information is stored as the states of transistors Q1 and Q2. The select line enables the cell, making it available for reading or writing, and the R/W lines decide the type of operation, read or write.

Q1 and Q2 work together: when one of them is on, the other is off. Q3 and Q4 serve as resistors, and Q5 and Q6 act as enable gates. During a write operation, the cell is first selected by raising the voltage level on the select line. Transistors Q5 and Q6 then act as short circuits, so the Read/Write 1 line is applied to the gate of Q2 and the Read/Write 0 line is applied to the gate of Q1. To write a 1 into the cell, a 1 is placed on R/W1 and a 0 on R/W0; this turns Q2 on and Q1 off. To write a 0 into the cell, a 1 is placed on R/W0 and a 0 on R/W1. In both cases the states of the transistors remain unchanged until the next write operation. To read the information from the cell, a voltage is applied to the select line, so the states of Q1 and Q2 transfer to R/W1 and R/W0.

DRAM:

Computer memory usually refers to DRAM, since it is the main memory of the computer. It comes in various formats (e.g. 30-pin vs. 72-pin, parity vs. non-parity, etc.) and in larger sizes than SRAM.

Like the SRAM, the DRAM chip is organized as a matrix of rows and columns of memory cells. The simplest type of DRAM cell contains only one transistor and one capacitor, as shown in the figure. Whether a cell contains a 1 or a 0 is determined by whether or not there is a charge on its capacitor.


Figure 10 DRAM memory Cell

When a read is required, a row select line is brought high. The activated row select line turns on the transistor Q of every cell in the selected row, which makes the refresh amplifier associated with each column sense the voltage level on the corresponding capacitor and supply it as a 0 or a 1. The column address enables one cell in the selected row for output. During this operation the capacitors of the entire row lose some of their charge, so the cells of the entire row are rewritten by the refresh amplifiers. A write operation proceeds the same way, except that the data input is stored in the selected cell while the other cells in the same row are refreshed.

Since the capacitor discharges through the pn-junction leakage current, this type of memory cannot hold data for long. As described earlier, a process called memory refresh is therefore repeated during the operation of the system, at a constant time interval (about 2 ms), to guarantee the validity of the data.

The memory refresh process differs from regular memory read or write operations in the following ways (a small sketch follows the list):

1. The address of the cell comes from a counter called the refresh address counter. This counter supplies the row address to the cells, while the column address remains constant during each refresh operation.

2. During the refreshing process all memory chips are enabled simultaneously to reduce the number of refresh cycles.

3. During this cycle, the data outputs are put into high-impedance mode.
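
A small sketch of this bookkeeping (refresh_row() is a stand-in for the hardware row cycle; the 128-row count is only an example, while the 2 ms interval is the figure given above):

    #include <stdio.h>

    #define ROWS        128        /* assumed example: 128 rows to refresh */
    #define INTERVAL_US 2000       /* ~2 ms refresh period, per the text   */

    unsigned refresh_counter = 0;  /* the refresh address counter          */

    /* Stand-in for the hardware cycle that rewrites one whole row
       (all chips are enabled simultaneously, data outputs tri-stated). */
    void refresh_row(unsigned row) {
        (void)row;
    }

    /* One refresh pass: walk every row, supplying only the row address. */
    void refresh_all(void) {
        for (int i = 0; i < ROWS; i++) {
            refresh_row(refresh_counter);
            refresh_counter = (refresh_counter + 1) % ROWS;
        }
    }

    int main(void) {
        refresh_all();   /* hardware repeats this every INTERVAL_US */
        printf("refreshed %d rows every %d us; counter at %u\n",
               ROWS, INTERVAL_US, refresh_counter);
        return 0;
    }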

Differences between DRAM & SRAM:

1) Standard DRAM does not have a fast enough access time (the fastest is currently about 70 ns), while SRAM is faster; therefore DRAM is used as the main memory in computers and SRAM is used in the cache memory.

2) The DRAM capacitor loses its charge over time and needs to have its charge refreshed at regular intervals. Thus, dynamic memories are accompanied by controller circuits that rewrite the bits and refresh the stored charge on a regular basis; SRAM does not need such a controller.

3) An SRAM cell requires four to six transistors, while a DRAM cell is much simpler and can be implemented with three or even one transistor. More memory cells can therefore be put in a single chip, and the total size of a DRAM is larger than that of an SRAM.

4) Neither SRAMs nor DRAMs retain information when power is removed (both are volatile); but with battery back-up, SRAMs can hold important configuration information when main power is removed, because they do not require refreshing.

5) The power consumed per bit of DRAM is less than that of SRAM: DRAM dissipates less than 0.05 mW per bit, while SRAM consumes 0.2 mW.

6) The DRAM is less expensive per bit than SRAM.

7) SRAM has the property of being "bistable": in other words, as long as current is applied to it, it retains its on/off state. Dynamic RAM, in contrast, is not bistable, as described earlier.

 
