Computer Organization and Architecture
Preface
Prerequisites
Learning ethics
Introduction
What is 🏗️Computer Organization and Architecture ?
Computer
Rapaport (2018) provides a more explicit characterization of a computational system defined as any “physical plausible implementation of anything logically equivalent to a universal Turing machine”.
[1] N. Angius, G. Primiero, and R. Turner, “The Philosophy of Computer Science (Stanford Encyclopedia of Philosophy),” Stanford.edu, 2013. [Online]. Available: https://plato.stanford.edu/entries/computer-science/. [Accessed: 30-Aug-2021]
Examples are:
Video game console.
Consoles.
Why does 🏗️Computer Organization and Architecture matter to you?
Ecosystem
Standards, jobs, industry, roles, and research
Units
GiB or GB
Gbps
Story
https://www.youtube.com/watch?v=zXPiqk0-zDY&ab_channel=FromScratch
https://www.computerhistory.org/timeline/
Von Neumann Architecture
Harvard Architecture
Modern Architectures: Hybrids
AVR® and ARM®
Von Neumann Model? Mauchly model?
This was teamwork, then We called EDVAC Model from “First Draft of a Report on the EDVAC”, no Von Neumann Model.
Turing model
Computer components
Store program
Evans, D. C. A. (2017, October 10). Stored Program Computer. Youtube. Retrieved from https://www.youtube.com/watch?v=j3XikkuFDSk&ab_channel=DrCraigA.Evans
MIPS 32
MARS MIPS simulator - Missouri State University. (2017, December 07). Retrieved from https://courses.missouristate.edu/KenVollmar/MARS
https://gist.github.com/sanchezcarlosjr/f19edbe47ad48b93b79800a285911a81
https://godbolt.org/z/Tzoj81h6e
https://en.wikibooks.org/wiki/MIPS_Assembly/MIPS_Details
https://electronics.stackexchange.com/questions/136916/use-of-at-register-in-mips
FAQ
Why do computers use binary?
Because electricity with 10 states is not the easy way. Please try to sum two numbers with electricity and ten states. Boolean algebra. Discrete signal processing.
Nibble
Why is random access memory called random?
“Random” means that the data does not have to be stored or read in any particular sequence. Random means lacking a definite plan, purpose, or pattern. Historically called random access memory because any data word can be accessed as easily as any other.
https://www.youtube.com/watch?v=Rc6qqqFZisI&ab_channel=AdamGaweda
Computer graphics and related stuff in modern computers
Fonts are vectors when you have an operating system they are saved as files in a special location in the storage device in order to show something on the screen, you need to pass by GPU. . But, also fonts and other graphical stuff are stored in the firmware (e.g. EFI or BIOS) itself.
Fonts, UNICODE, UTF-8, ASCII, and others.
All are bits. But it is depending on who receives that word and then executes an operation or another one.
PRINTF(01001100) -> Send to Graphics Processing Unit -> PRINT A on Screen
ADD(01001100,01001100) -> Send to Proccesor -> 76 + 76 and send in register.
Levels of the answer:
Compiler, Assembler, Binary, Hardware, Formal languages.
http://wiki.icmc.usp.br/images/4/42/Aula_20_-_memoria_harris_book.pdf
Where is the ASCII table located in the memory? Which type of memory do we use to store it?
How did computers understand ASCII bit? For example, 01001100 is L, not 76. Does a computer store integers according to ASCII?
Worked examples
Computer Abstractions and Technology
Programmer’s point of view
Domain language level
Assembly language level
Operating system machine level
Instruction set architecture level (ISA level)
Microarchitecture level
Digital logic level
User’s point of view
Application software
Operating system
EFI/BIOS
Firmware
Hardware
Overflow
What is the first process? General boot process.
A typical boot process for commercial computers.
- Motherboard
- Firmware. BIOS/EFI. Bootloader. Protected mode.
- Kernel.
- User interface.
- User Applications.
So, our first problem is starting after firmware.
GDT Tutorial
Motherboard
Computer performance measurement
FLOPS
Bits
https://www.youtube.com/watch?v=jAu0oyxsP20&ab_channel=FrankStajanoExplains
Exercises and Projects
Summary
Key decisions
FAQ
Reference Notes
Instructions: Language and the computer
Subsection
Subsubsection
Exercises and Projects
Summary
Key decisions
FAQ
Reference Notes
Arithmetic for Computers
Subsection
Subsubsection
Exercises and Projects
Summary
Key decisions
FAQ
Reference Notes
The processor
Microcode
Interpreter
We know by memory hierarchy the CPU has memory, it has registers.
Datapath
Subsection
Subsubsection
Exercises and Projects
Summary
Key decisions
FAQ
Reference Notes
Large and Fast: Exploiting Memory Hierarchy
Memory hierarchy.
Flash, SRAM y EEPROM memory
When people say “memory” normally refers to RAM. However, today we have a memory hierarchy. We explain the memory layout in the RAM, register behavior, and communication between the RAM and the processor using MARS.
Word.
Bit numbering
https://www.youtube.com/watch?v=jhErugDB-34
https://www.youtube.com/watch?v=RvFRCDoj6JI
(LSB) Little Endian, (MSB) Big Endian.
LSB first.
MSB first.
Example. 0x010000…00
(32-byte one, LSB first)
The value 0x010000…00
is a hexadecimal representation of a 32-byte (or 256-bit) number where the least significant byte (LSB) comes first (little-endian format).
In little-endian format, the least significant byte is stored in the smallest address and hence comes first. So 0x01
is the least significant byte, and the rest of the bytes (31 of them, in this case) are 0x00
.
If this was a big-endian format (most significant byte first), it would represent a very large number with a value of 1
followed by 31 zero bytes.
But as a little-endian format, it represents a small number because 0x01
is the least significant byte. It means the number is just 1
in decimal.
For comparison:
- Little-endian:
0x010000...00
(LSB first) represents 1 in decimal.
- Big-endian:
0x010000...00
(MSB first) would represent a very large number (about 1.16e+77 in decimal).
Consider a 4-byte (32-bit) integer value of 1
. In hexadecimal representation, it is 0x00000001
.
- In little-endian (LSB first), this would be stored or transmitted as
0x01000000
.
- In big-endian (MSB first), this would be stored or transmitted as
0x00000001
.
Consider a 4-byte (32-bit) integer value of 254. In hexadecimal representation, it is 0x000000FE
.
- In little-endian (LSB first), this would be stored or transmitted as
0xFE000000
.
- In big-endian (MSB first), this would be stored or transmitted as
0x000000FE
.
Arrays, Structs (or classes), numbers, and chars. Offset.
List.
Type.
Construction.
Address vs Content.
Intel® 64 and IA-32 Architectures Software Developer Manuals. (2020, December 22). Retrieved from https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
Little Endian vs Big Endian
TODO: In the model, we must include the stored program, the “word” concept, the “program to process” process, an array, a list, and a tree. We must include a model with electronic.
A word is a sequence of bits where a processor can operate with them in constant time .
How many bits does a WORD contain in 32/64 bit OS respectively? (2023, January 15). Retrieved from https://stackoverflow.com/questions/5295903/how-many-bits-does-a-word-contain-in-32-64-bit-os-respectively
So, suppose we say a word of 64 bits, it means you ca
Learn MIPS Assembly in Y Minutes. (2023, January 15). Retrieved from https://learnxinyminutes.com/docs/mips
Memory Layout —RAM layout virtualization
data
text
heap
stack
Heap Memory
https://stackoverflow.com/questions/2308751/what-is-a-memory-heap
https://opendsa-server.cs.vt.edu/OpenDSA/Books/CS2/html/HeapMem.html
Memory in action
Where do you locate data in memory? What looks like a memory layout with different types?
Integer
The most basic data type is an int
. Now, we save a byte 8 bits, so the largest decimal you can save in a byte is . You are going to experiment with it .data, so you don’t need registers.
.data
constant: .byte 97
Assemble it and now you can observe the memory layout in the execute tab.
Congratulations! You are allocated a byte with the value 97 at the address 0x1001000.
Ok, maybe you are thinking that interesting happened in boundaries. That is a negative number and a considerable number —a number is out-of-range for a signed value. Managing negative numbers means you have to deal with 2 complements. It means when you allocate a byte with signed values, you can express positive numbers and you can express negative numbers; therefore, a big number happened when you have a decimal number greater than and less than 256 () —if you express a number greater than 255 your program will have strange behaviors.
127 case.
.data
var1: .byte 127
-1 case.
.data
var1: .byte -1
128 case.
.data
var1: .byte 128
Word, byte, and word alignment TODO Registers.
But, what happened if you want to save a word?
.data
var1: .word 100
var2: .word 300
var3: .word 140
var4: .word 5000
var5: .word 10000
var6: .word 20000
var7: .word 30000
var8: .word 60000
var9: .word 70000
.data
var1: .word 100
var2: .word 300
var3: .word 140
var4: .word 5000
var5: .word 10000
var6: .word 20000
var7: .word 30000
var8: .word 60000
var9: .word 70000
And array?
.data
var1: .word 1 2 3 4 5 6 7 8
You want fill out the memory
.data
var1: .word 1 2 3 4 5 6 7 8
var2: .word 1 2 3 4 5 6 7 8
var3: .word 1 2 3 4 5 6 7 8
var4: .word 1 2 3 4 5 6 7 8
var5: .word 1 2 3 4 5 6 7 8
var6: .word 1 2 3 4 5 6 7 8
var7: .word 1 2 3 4 5 6 7 8
var8: .word 1 2 3 4 5 6 7 8
var9: .word 1 2 3 4 5 6 7 8
var10: .word 1 2 3 4 5 6 7 8
var11: .word 1 2 3 4 5 6 7 8
var12: .word 1 2 3 4 5 6 7 8
var13: .word 1 2 3 4 5 6 7 8
var14: .word 1 2 3 4 5 6 7 8
var15: .word 1 2 3 4 5 6 7 8
var16: .word 1 2 3 4 5 6 7 8
And what happened if you run out of memory?
Char
.data
constant: .byte 'a'
String
.data
var1: .asciiz "Hello"
Float
Array
.data
var1: .word 0x20FF 0x100019AF '0'
Subsection
Subsubsection
Field-programmable gate array
Hardware description language
Exercises and Projects
Interleaved Memory Simulator
RAID
Summary
Key decisions
FAQ
Reference Notes
IO Operations
Subsection
Subsubsection
Exercises and Projects
Cache Time Analysis
Cache Simulator
Multitask Cache Demonstrator
Selective Victim Cache Simulator
Summary
Key decisions
FAQ
Reference Notes
Parallel Processors from Client to Cloud
Flynn's taxonomy
Duncan's taxonomy
AVX-512
Advanced Vector Extensions
Single instruction, multiple threads (SIMT)
Single instruction, multiple data (SIMD)
Subsection
Subsubsection
Exercises and Projects
Cache Time Analysis
Cache Simulator
Multitask Cache Demonstrator
Selective Victim Cache Simulator
Summary
Key decisions
FAQ
Reference Notes
The basics of Login Design
Subsection
Subsubsection
Exercises and Projects
Summary
Key decisions
FAQ
Reference Notes
Graphics and Computing GPUs
GPU
GPGPU (General-purpose computing on graphics processing units)
OpenCL
NVDIA and amd
CUDA
nvcc
opengl
vulkan
Water Cooling
TPU
NPU
Exercises and Projects
Summary
Key decisions
FAQ
Reference Notes
Mapping Control to Hardware
Subsection
Subsubsection
Exercises and Projects
Summary
Key decisions
FAQ
Reference Notes
Tutorials
Assemblers, Linkers, and the SPIM Simulator
Implementation of mathematical functions in computer (no lib)
Real Assembly. Systems based on microprocessors.
1. El universo digital del IBM PC, AT y PS/2. Ciriaco García de Celis, documentación
gratuita en la red
2. Los microprocesadores Intel. Barry B. Brey, Ed. Prentice-Hall
3. IBM PC & XT, Assembly Language. Leo. J. Scalon, Ed. Brady
4. Arquitectura, programación y diseño de sistemas basados en microprocesadores
(80x86/80186/80286). Yu-Cheng Liu y Glenn A. Gibson, Ed. Anaya
Procedures & The Stack
https://courses.cs.washington.edu/courses/cse351/16sp/lectures/06-procedures_16sp.pdf
References
At ieee dot org, B. K. k. (2020, August 25). Programmed Introduction to MIPS Assembly Language. Retrieved from https://chortle.ccsu.edu/assemblytutorial/index.html#part1
RISC Architectures
x86-64 Architectures
Create a program in an assembler that calls some modern instruction
SHA
Special extensions in modern x86 CPUs
AVX
SHA
MMX
RdRand
Building a computer from scratch
mortbopet. "VSRTL." GitHub, 27 Jan. 2023, github.com/mortbopet/VSRTL.
Computing with Register Machines by SICP.
Nöding, Florian. "How a CPU works: Bare metal C on my RISC-V toy CPU." 26 Jan. 2023, florian.noeding.com/posts/risc-v-toy-cpu/cpu-from-scratch.
Clock
Computer arithmetic with circuits in a datapath
Assembling a computer from commercial products
Discrete Optimization problem
Ram Modules
DDR4 and DDR5
CAS Latency
Modules vs different
Make sure that the motherboard supports the memory. It is recommended that memory of the same
capacity, brand, speed, and chips be used.
ECC and On-Dead ECC
Quad channel
Dual channel
SDR SDRAM (303)
CL and Latency
SDRAM
Memory frequency, also known as memory clock speed
Memory Device
Array Handle: 0x000A
Error Information Handle: 0x0013
Total Width: 64 bits
Data Width: 64 bits
Size: 32 GB
Form Factor: DIMM
Set: None
Locator: DIMM 1
Bank Locator: P0 CHANNEL A
Type: DDR4
Type Detail: Synchronous Unbuffered (Unregistered)
Speed: 3200 MT/s
Manufacturer: Kingston
Serial Number: 2D26B51C
Asset Tag: Not Specified
Part Number: KF3200C16D4/32GX
Rank: 2
Configured Memory Speed: 3200 MT/s
Minimum Voltage: 1.2 V
Maximum Voltage: 1.2 V
Configured Voltage: 1.2 V
Memory Technology: DRAM
Memory Operating Mode Capability: Volatile memory
Firmware Version: Unknown
Module Manufacturer ID: Bank 2, Hex 0x98
Module Product ID: Unknown
Memory Subsystem Controller Manufacturer ID: Unknown
Memory Subsystem Controller Product ID: Unknown
Non-Volatile Size: None
Volatile Size: 32 GB
Cache Size: None
Logical Size: None
Bit Twiddling Hacks or Bit manipulation
https://graphics.stanford.edu/~seander/bithacks.html
https://homolog.us/blogs/bioinfo/2014/12/04/the-entire-world-of-bit-twiddling-hacks/
https://www.fit.vutbr.cz/~ibarina/pub/bithacks.pdf
https://news.ycombinator.com/item?id=17043082
https://web.stanford.edu/class/archive/cs/cs107/cs107.1216/lectures/03/Lecture03.pdf
https://docs.arduino.cc/learn/programming/memory-guide/
[1] S. F. Barrett and D. J. Pack, Microchip AVR® Microcontroller Primer: Programming and Interfacing, Third Edition (Synthesis Lectures on Digital Circuits and Systems), Morgan & Claypool, 2019.
[2] J. Y. Yiu, The Definitive Guide to Arm® Cortex®-M0 and Cortex-M0+ Processors, Second ed., Newnes, 2015.
[3] J. Yiu, The Definitive Guide to ARM® Cortex®-M3 and Cortex®-M4 Processors, Third ed., Newnes, 2014.
Builtin functions of GCC compiler
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
__builtin__popcount
(Hamming weight)
__builtin_parity
__builtin_clz
Worked examples
Make a program in C++ that checks if the Sentence Is Pangram with no set/map.
bool checkIfPangram(string sentence) { int i = 0; int set = 0; while(i<sentence.size() && set != 67108863) { int element = sentence[i]-97; set = set | (1 << element); // 0, 1, 11, 111, 111, 1111, .... i++; } return set == 67108863; // 11111111111111111111111111 }
Make a program in C++ that detects if a number is the power of 2 with no modulus.
bool checkIfPowerOf2(int x) { return x&1 == 0; }
Make a classic Tic Tac Toe game that works with bit-wise operators. The board should be an integer or hex too.
Algorithm for Determining Tic Tac Toe Game Over. (2022, October 01). Retrieved from https://stackoverflow.com/questions/1056316/algorithm-for-determining-tic-tac-toe-game-over
https://cs.wellesley.edu/~cs240/s17/slides/bits-handout.pdf
Getting 32 bit words out of 64-bit values in C/C++ and not worrying about endianness. (2023, January 15). Retrieved from https://stackoverflow.com/questions/2194310/getting-32-bit-words-out-of-64-bit-values-in-c-c-and-not-worrying-about-endian
Core Wars
https://vyznev.net/corewar/guide.html#intro_what
Next steps
References
References (1)
TODO
Type Space
1 ASCII char is 1 byte