
 University at Albany    CSI400 Spring 2003-4 Lectures   1 -2  Computer Science Department

Lecture 01 

  • One main entity of study in this course is the process, and the main problem about processes is concurrency, which is several processes existing at the same time, affecting each other, and what that entails. 
  • We will also study (computer) system structure, which is the organization of certain systems into various parts. One main problem about structures is "What are the purposes and functions of each part?" Why do contemporary computer systems have, say, virtual memory? (Virtual memory is one such part.) What does each part do? We will answer these questions with different levels of detail. Another main problem is "How do the parts interact?" The system design problem is "How do we invent the parts, identify needed functions, and choose which functions are provided by each part, with various goals in mind?" These goals include convenient, efficient and reliable operation of the overall system. Goals such as "convenience, efficiency and reliability" must be made more specific to be of any help to critics and designers of systems.
  • Two general principles in engineering:
  • Today, computer systems are roughly structured into two parts: hardware and software; the users, often including people, make up the environment of the system. The software is structured roughly into two parts: operating system software and application programs. Such boundaries are rough and subject to change over time. For example, the "BIOS" program, which is the "firmware" stored in read-only or flash memory in a standard PC, is considered software by some people and hardware by others. My opinion: The boundary is determined by how hard, i.e., expensive, it is to change the part.

  • Here are some examples of college courses and their main entities, solicited from the class:
    Course Title          Entities and issues

    Project Management    "Crashing a Project" (a method for making projects go faster) and Excel (particular spreadsheet software).  The issues include how to "crash" and how to use spreadsheets.
    Guitar Playing        Guitars.  The issues include the concepts and skills required for playing them.
    History of Religion   Religions and their beliefs, practices and historical development.  The issues include how these items were created, changed and influenced.

  • What do computers do? (1) Activate their I/O devices to achieve the system's goals. The software determines and provides crucial details of how the I/O devices behave. (2) Execute one or more programs by creating and running one or more processes.

  • What  is a process?  A process is ONE run (start to finish) of one program.  Computer science specialists must understand in detail what this means.  For our purposes, it's best to think of the program as written in assembly language.  
  • The first thing a C/C++ programmer must think about when imagining what a program does is its variables.
  • A variable is a "storage space"
  • A variable can be implemented by a physical memory location, a particular region on a particular chalkboard (illustrated by a rectangular box the instructor drew in chalk), or a few among the millions of connected capacitor/transistor configurations (electronic circuits) on a memory chip.  I think of a variable as a (shoe) box that holds or contains a value.
  • Each variable has a value that persists until it is changed or the variable is destroyed.  Thus a variable is an entity with the capacity of having its value change.  (Each computer programming variable has a finite domain of values, no larger than 2^b where b is the number of bits used to implement that variable.  However, suitable data structures composed of an indefinite number of "finite" variables like these can implement variables which have, for all practical purposes, infinite domains.)
  • The "name of a memory location" is conceptually different from the memory location (or variable) it names.  One single variable might have no name (anonymous).  On the other hand, a single variable can have several names.  In that case, we say that variable is aliased.
  • In C/C++, variables that are declared (for example, XX in int XX;) are named by strings such as XX which are called identifiers.  Each identifier is bound to a variable according to somewhat complicated rules belonging to the C/C++ languages.
  • For example, within the block  { int XX; ....   }  a memory location within the stack is allocated and XX is bound to it when control flows into the block.  When control flows out of the block, this binding is destroyed.  (An optimizing compiler might generate assembly language code that doesn't use a memory location on the stack for XX, but makes the program work just as if the stack were actually used, except the program might run faster.  But still, how XX works obeys the rules of C/C++, assuming the compiler is correct.)
  • Programmers must be able to think about what the computer is doing when it executes one program.  We described what happens when the C/C++ assignment expression (X=3) and the equality (comparison) expression (X==3) are executed, and what values these two expressions have.
  • In (X=3), the value of variable X is made to be 3.  The value of the whole expression is 3, so in fact the expression (Y=(X=3)) makes the values of both X and Y be 3.
  • In (X==3) the value of X is left unchanged.  The computer retrieves the value of X , compares it to 3, and makes the value of the expression be 0 if the two compared values are different and be 1 if the two compared values are equal.
  • A single run of a single program in a single computer is called a process.
  • The "stream of activity" performed by the computer executing statements one by one, one after the other, within one process, is called a thread. The word stems from the idea of a thread (perhaps drawn through cloth by a needle) winding its way through the program code. We drew a wiggly line through the three expressions and the body of a schematic Java/C/C++ for() statement as the loop runs four times.
  • Each process, like a single performance of a musical piece, has a definite extent in time.
  • Programs are different from processes.  A program is a "static" or "dead" entity.  It is just data. Software sold in stores includes one or more program files printed on optical disks.
  • Current personal computers and many other computers such as servers or mainframes (including embedded processors) can have many processes running at the same time.  For example, many students in the CSI310 class might have issued the g++ command on the machine named acunix1.albany.edu to compile their own CSI310 assignment programs, so these commands are active at the same time.  The operating system in acunix1.albany.edu helps ensure that these separate runs of g++ do not interfere with one another, and each student gets the feeling that the computer is doing its work exclusively for him or her.
  • What about I/O, file and interprocess communication operations?  These are performed by special program operations called system calls.  A system call (such as open(), read(), write(), exit()) looks superficially like the call to a library function (such as strcpy(), atoi()).  At the assembly language level, system calls are clearly different.  
Lecture 02

    Synopsis: Process as the running of one assembly/machine language program.  Basic use of the read() system call required for Project 1.  Assembly language variables are of two kinds: register and "memory" (randomly accessible addressable memory).  Assembly language instruction types: arithmetic/logical, memory access, and control (distinct types for RISC instruction set architectures such as those taught at Albany's CSI333).  Operating system copies program code as data from a file to memory in order to create a process.  A few things a process consists of: memory and register contents, file descriptor information, the program counter value.
     
  • Administrivia:  We will not cover many details of operating system history or the general overviews that are written about in the textbooks.  The lectures will emphasize ideas and concepts to guide your reading of the theoretical textbook sections, plus discussion of details to support the projects.  The first reading assignment has parts that cover the current theoretical topics plus parts specific to the first project.  The first project was assigned today.  It begins with a part to write a program that reads input characters, processes them in a simple way and writes the resulting characters.  It concludes with a client/server application developed from the first part.
  • It's important for class members to Raise Questions and Issues about the projects as well as any other course material.
  • PROCESS: The running of ONE program in one computer.  It must be understood to be an assembly language (or better, machine language) program!
  • Operating system: The software component of the hardware-software system we will call the computer that implements processes in cooperation with the hardware.  It is impossible to understand the organization of operating systems without understanding processes, because the purpose of an operating system is to implement processes.
  • Operating Systems are organized into: Process management, implementation and interprocess communication, memory systems, input/output and file systems.
  • Tips for Project 1:  One of the things a process can do is perform (file) read and write operations.  These are examples of system calls.
  • The use of read, write, open and close is REQUIRED in project 1:  Using fstreams, fread, printf, cin >>, cout <<, etc. is not acceptable.
  • Synopsis of read:  (number of bytes actually copied if this number is >0) = read( file descriptor, pointer to space to read into, count);
  • Use file descriptor value 0 (an integer) to read from the terminal that runs the process.
  • Space in which to read must be allocated ahead of time.  Recall how to do this:
  • char buffer[SIZE]; static or automatic (means in the stack) depending on where this declaration appears

  • (In C/C++, the expression buffer denotes the address of the first byte of the allocated array.)
  • dynamic or programmer controlled allocation, done on the heap:
  • C++ way:   char *pbuff;  pbuff = new char[SIZE];
  • C way:  char *pbuff; pbuff = malloc(SIZE);
  • Do not mix the C and C++ ways in the same program.
  • Allocated space MUST be at least as large as the maximum number of bytes requested:  count <= SIZE

  • Otherwise, there might be a disaster: your program will display mysterious bugs or crash, or a worse disaster:
  • read returns a number always <= (less than or equal) to count.
  • The meaning of read returning 0 varies with the kind of source the file descriptor describes.  For a disk file, this signifies that the position indicator is at the end of the file.  For a network socket, this signifies the connection has closed.
  • If read returns a negative value, it will be -1.  This indicates some kind of error.  The nature of the error is coded by the value of global variable errno.
  • Assembly language programs (better called machine language programs) use two kinds of data storage facilities:
    1.  Registers - a small number (32 integer registers in MIPS) of cells, each stores 32 bits.
    2.  Memory (better called Randomly Accessible Addressable Memory, or RAM)
  • Assembly language instructions generally fall into 3 categories:
    1. Data processing or arithmetic/logical instructions

       R1 <- R2 + R3  is the operation done by  add  R1, R2, R3
       (Copy the values from registers R2 and R3, make the hardware add them (binary) and copy the sum into register R1)
    2. Memory access instructions

       R1 <- M[R2+sign extend(DISP)]  is the operation done by the load instruction:   ld  R1, DISP(R2)
       M[R2+sign extend(DISP)] <- R1 is the operation done by the  store instruction:  st  R1, DISP(R2)
    3. Control instructions:  Branches (conditional), jumps, function call instructions that save the return address somewhere.
  • These 3 categories cover the instructions of "Reduced Instruction Set Computer" (RISC) style Instruction Set Architecture computers.
  • VERY IMPORTANT:  Machine language instructions are coded by numbers stored in binary, in the RAM of the computer.
  • One job of the operating system is to copy the part of an executable file containing machine instructions into the memory, just like any other program would copy data.  Then, the operating system would direct the computer to use this data as instructions.
  • Among other things, one process consists of:
  • The values in all the REGISTERS (both explicit ones and others used to hold results of comparisons)
  • The contents of ADDRESSABLE MEMORY (instructions, data and stack):  The current values in all currently allocated locations in memory.
  • Information about which file descriptors are valid (i.e. OPEN) and what each describes, whether it be a file or a socket.
  • The value of the PROGRAM COUNTER (a special register):  The address of the next instruction to execute when the process is resumed after it is stopped.