Intro to Processes

Components:
	- object program instructions (T) (program text in UNIX)
	- program data (D) (static variables)
	- stack (S) (dynamic variables, function params, etc.)
	- execution context of program: 
		- CPU registers
		- memory usage (valid addresses) 
			-- what happens if process tries to go beyond?
		- resource usage (eg. files, devices)
		- priority, status (blocked, running, etc.)
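
A rough illustration of the T, D and S components listed above: the sketch below records one address from each region. The names segment_sample, sample_segments and print_segments are invented for this example, and the actual addresses (and their ordering) depend on the platform, so nothing is assumed about them.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: one sample address per region. */
struct segment_sample {
	uintptr_t text;		/* T: address of a function */
	uintptr_t data;		/* D: address of a static variable */
	uintptr_t stack;	/* S: address of a local variable */
};

static int static_counter = 42;	/* lives in the data region */

struct segment_sample sample_segments(void)
{
	int local = 0;		/* lives on the stack */
	struct segment_sample s;

	s.text  = (uintptr_t)&sample_segments;
	s.data  = (uintptr_t)&static_counter;
	s.stack = (uintptr_t)&local;
	return s;
}

/* Prints the three sample addresses; values differ from run to run. */
void print_segments(void)
{
	struct segment_sample s = sample_segments();

	printf("text  (T): %p\n", (void *)s.text);
	printf("data  (D): %p\n", (void *)s.data);
	printf("stack (S): %p\n", (void *)s.stack);
}
```

Calling print_segments() from main shows three widely separated addresses on most systems.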

- OS maintains descriptor (process control block) for each process
- descriptor created when process created
- descriptor destroyed when process terminated (sort of: in UNIX the entry lingers until the parent collects the exit status)

    |---------------------------------|
    v  proc B         v-----------|   | process descriptor table (PCBs)
----------------------------------|---|---------------
 | S | D | T |     |S|D|T|     | | | | | |       | |      MAIN MEMORY
--------------------------------------^-----------|---
 ^        ^  ^     proc A             |-----------| current process pointer
 |        |  |
----------------
|H        P  L |   Process registers
|i        C  o | 
----------------


Process Creation: Theory

- primitive operations postulated by Conway: fork, join, quit
- forked process shares same address space (unlike real systems, eg. UNIX)
- quit terminates a process
- join(arg) synchronizes processes:
	if (--arg != 0) quit
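
The decrement and the test in join must happen as one indivisible step, otherwise two processes arriving together could both quit or both continue. A minimal sketch, assuming C11 atomics are available; conway_join is a made-up name, and instead of quitting it returns a flag so the caller decides what to do:

```c
#include <stdatomic.h>

/* Mirrors "if (--arg != 0) quit".
 * Returns 1 if the caller is the last arrival and may continue,
 * 0 if the caller should quit. */
int conway_join(atomic_int *arg)
{
	/* atomic_fetch_sub returns the value held before the decrement */
	int remaining = atomic_fetch_sub(arg, 1) - 1;

	return remaining == 0;
}
```

With the counter initialised to 2, the first caller gets 0 (quit) and the second gets 1 (continue).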

Classical Interprocess Communication 

     Parent process:          Child process:
           A;
           fork(L);               L: C
           B;                        join(V);
           await-join(V);
           D;

- fork gives control to L in child and next statement in parent
- in UNIX:

     A;
     if (fork() == 0) {
        B;               /* child */
        exit(0);         /* terminate child, signal parent */
     } else {
        C;               /* parent */
        wait(NULL);      /* await signal from child */
     }
     D;

- synchronization example: 
	process A() {			process B() {
	   while (1) {			   while (1) {
		compute A1;			read x;
		write x;			compute B1;
		compute A2;			write y;
		read y;				compute B2;
	   }				   }
	}				}
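
One way to get the ordering in the example above (B cannot read x before A writes it, A cannot read y before B writes it) is to carry x and y over two pipes, since a read on an empty pipe blocks. This is only a sketch: exchange_rounds is a made-up name, the "compute" steps are stand-ins, and the loop is bounded instead of infinite.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `rounds` iterations of the A/B exchange.
 * Returns how many rounds produced the expected y, or -1 on setup error. */
int exchange_rounds(int rounds)
{
	int x_pipe[2], y_pipe[2];	/* A -> B carries x, B -> A carries y */
	int ok = 0;
	pid_t pid;

	if (pipe(x_pipe) == -1 || pipe(y_pipe) == -1)
		return -1;
	pid = fork();
	if (pid == -1)
		return -1;

	if (pid == 0) {			/* process B */
		int x, y;

		close(x_pipe[1]);
		close(y_pipe[0]);
		/* read x: blocks until A has written it */
		while (read(x_pipe[0], &x, sizeof x) == (ssize_t)sizeof x) {
			y = x + 1;	/* compute B1 (stand-in) */
			if (write(y_pipe[1], &y, sizeof y) != (ssize_t)sizeof y)
				_exit(1);
		}
		_exit(0);		/* EOF on x: A is finished */
	}

	/* process A */
	close(x_pipe[0]);
	close(y_pipe[1]);
	for (int i = 0; i < rounds; i++) {
		int x = i * 10, y;	/* compute A1 (stand-in) */

		if (write(x_pipe[1], &x, sizeof x) != (ssize_t)sizeof x)
			break;
		/* read y: blocks until B has written it */
		if (read(y_pipe[0], &y, sizeof y) != (ssize_t)sizeof y)
			break;
		if (y == x + 1)
			ok++;
	}
	close(x_pipe[1]);		/* EOF tells B to stop */
	close(y_pipe[0]);
	waitpid(pid, NULL, 0);
	return ok;
}
```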


UNIX process-related system calls


use ps command (ps -aux) to see list of processes in system

Process Creation in UNIX:

	fork(2):	#include <unistd.h>
			pid_t fork(void);

- creates new process (child)
- child image is an identical copy of parent
- parent and child do not share address space, have different PCB and PID
- files open in parent remain open in child
- in parent, fork returns with child_pid value
- in child, fork returns with 0

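	#include <errno.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/types.h>
	#include <unistd.h>

	pid_t childpid;

	childpid = fork();
	switch (childpid) {
		case -1:	/* fork failed: no child was created */
			fprintf (stderr, "error: %s\n", strerror(errno));
			exit(1);
			break;
		case 0:
			/* child does its thing */
			break;
		default:
			/* parent does its thing; childpid holds the child's PID */
			break;
	}

- sys_errlist[errno] is a nonstandard, obsolete interface;
  strerror(errno) (used above) or perror are the portable ways to report errors:
	perror(3):	void perror(const char *s);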
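
To check the claim that parent and child do not share an address space after fork, the sketch below lets the child overwrite a variable and then looks at the parent's copy. The function name is made up for this example. The child leaves with _exit rather than exit so that stdio buffers inherited from the parent are not flushed a second time.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's copy of `value` after the child has overwritten
 * its own copy, or -1 if fork or waitpid fails. */
int parent_value_after_child_write(void)
{
	int value = 1;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0) {
		value = 99;		/* changes only the child's copy */
		_exit(value == 99 ? 0 : 1);
	}
	if (waitpid(pid, NULL, 0) == -1)
		return -1;
	return value;			/* parent's copy is untouched */
}
```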

- child can determine its pid and parent's pid using
	getpid(2):	pid_t getpid(void);
	getppid(2):	pid_t getppid(void);
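
A small sketch of the two calls, assuming the parent stays alive until the child has run (it does here, because it waits). The name child_reports_parent is invented for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child saw getppid() equal to the parent's pid and
 * a getpid() of its own that differs from it; 0 if not; -1 on error. */
int child_reports_parent(void)
{
	pid_t parent = getpid();	/* recorded before the fork */
	pid_t pid = fork();
	int status;

	if (pid == -1)
		return -1;
	if (pid == 0) {
		int ok = (getppid() == parent) && (getpid() != parent);

		_exit(ok ? 0 : 1);	/* report the result via exit status */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```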

- parent must often coordinate its activity with children
  (e.g. exchanging messages)
- simplest coordination is to synchronize with children's termination:
	exit(3):	void exit(int status);
			/* makes lower 8 bits of status available to waiting parent */
			/* convention is to exit with 0 on correct termination */

	wait(2):	pid_t wait (int *status);
			/* allows parent to sleep until one of its children exits */
			/* returns pid of that child; its exit status is stored in */
			/* *status and extracted with WEXITSTATUS(*status) */
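
The "lower 8 bits" rule can be seen with a child that exits on a value larger than 255. In this sketch (child_exit_status is a made-up name) the child uses _exit, which passes the status on in the same way as exit but skips the stdio flush.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that terminates with `code` and returns the status the
 * parent actually observes, or -1 on error. */
int child_exit_status(int code)
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(code);		/* only code & 0377 reaches the parent */
	if (wait(&status) != pid)	/* sleep until the child terminates */
		return -1;
	if (!WIFEXITED(status))		/* killed by a signal, not an exit */
		return -1;
	return WEXITSTATUS(status);
}
```

For example, a child exiting with 300 is seen by the parent as 44 (300 - 256).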

- two additional functions of interest:

	atexit(3):	int atexit(void (*func)(void));	
			/* called on normal termination */

	waitpid(2):	#include <sys/types.h>
			#include <sys/wait.h>
			pid_t waitpid (pid_t pid, int *status, int options);
			/* POSIX extension: allows parent to specify child */
			/* pid == -1: waits for any child */
			/* pid > 0: waits for specified child */
			/* options: WNOHANG (most commonly used val) = don't block */
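
A sketch of WNOHANG in use: the parent polls a specific child without blocking, then waits for it properly. The child is held back on a pipe so that the first poll is guaranteed to find it still running; poll_then_reap and the exit value 5 are arbitrary choices for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child's exit status (5) if the non-blocking poll saw the
 * child still running and the blocking wait then reaped it; else -1. */
int poll_then_reap(void)
{
	int gate[2];
	int status;
	pid_t pid, early;

	if (pipe(gate) == -1)
		return -1;
	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		char byte;

		close(gate[1]);		/* keep only the read end */
		/* blocks until the parent closes its write end (EOF) */
		if (read(gate[0], &byte, 1) < 0)
			_exit(1);
		_exit(5);
	}
	close(gate[0]);
	early = waitpid(pid, &status, WNOHANG);	/* child running: returns 0 */
	close(gate[1]);				/* release the child */
	if (waitpid(pid, &status, 0) != pid)	/* now block until it exits */
		return -1;
	if (early != 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```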

- whole point of forking a child is generally to execute a different task
- therefore, need to replace the original process's image with a new one
- stack is also replaced so no return is possible (except if OS fails to exec)

Executing programs inside a process in UNIX: 
	exec(2):	execl, execv, execle, execve, execlp, execvp

- new program executed inside process that calls exec
- same PID, open file descriptors, etc.
- simple form:
	int execl (const char *path, const char *arg0, ... [(char *)0]);

- example: want to rename grades file as a backup (% mv grades grades.bak)
- use child to do the dirty work:
	if (fork() == 0) {
		execl ("/bin/mv", "mv", "grades", "grades.bak", (char *)0);
		perror ("execl");	/* if we're here, execl failed */
		exit(1);
	}
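
The fragment above shows only the child's side. A fuller sketch adds the parent, which waits and picks up the exit status. To avoid touching real files it runs /bin/sh -c with a command string instead of mv; run_shell_command is a made-up name, the presence of /bin/sh is an assumption, and 127 is used here as the "exec failed" status by convention.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `command` through /bin/sh in a child process.
 * Returns the command's exit status, 127 if execl failed, -1 on error. */
int run_shell_command(const char *command)
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0) {
		execl("/bin/sh", "sh", "-c", command, (char *)0);
		_exit(127);		/* reached only if execl failed */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```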

- simpler way to do this is to use the path-aware version:
	int execlp (const char *filename, const char *arg0, ... [(char *)0]);
	e.g. execlp ("mv", "mv", "grades", "grades.bak", (char *)0);
- the path-aware form offers two advantages:
	- looks for program in all directories specified by PATH variable
	- if found but not a binary executable (based on magic number),
	  assumes it's a shell script, so execs /bin/sh and feeds it the
	  file and arg list
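
A sketch of the PATH search: the program is given by bare name and execlp locates it. The name run_from_path is invented, and the example assumes a program called true can be found on PATH, as on most UNIX systems.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `program` (no arguments) after a PATH lookup by execlp.
 * Returns its exit status, 127 if it could not be executed, -1 on error. */
int run_from_path(const char *program)
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0) {
		execlp(program, program, (char *)0);
		_exit(127);		/* not found on PATH, or not executable */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```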

	Aside -- Shell scripts:
	- default interpreter is Bourne shell 
	- can specify another interpreter if first line of file contains:
		#!<interpreter path> [<arg>]
	- interpreter is executed with the script's name and arguments in argv[]

	- examples:
		#!/bin/sh
		echo foo

		#!/bin/perl
		print "Hello\n";

- can also use environment-aware form:
	int execle (const char *path, const char *arg0, ... 
		[(char *)0, char *envp[]]);
	extern char **environ;
	char *getenv(const char *name);
- environment strings look like:
	HOME=/users/jer
	PATH=/bin:/usr/bin
	HOSTNAME=vivaldi
	...
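
A sketch of the environment-aware form: the child is started with an environment holding a single string, and a shell test checks that the variable arrived. GREETING and run_with_greeting are made-up names, and /bin/sh is assumed to exist with a built-in [ command.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts /bin/sh with an environment containing only GREETING=<value>.
 * Returns 0 if the shell saw GREETING equal to "hello", non-zero if it
 * did not, -1 on error. */
int run_with_greeting(const char *value)
{
	char assignment[128];
	char *envp[2];
	int n, status;
	pid_t pid;

	n = snprintf(assignment, sizeof assignment, "GREETING=%s", value);
	if (n < 0 || n >= (int)sizeof assignment)
		return -1;		/* value too long for the buffer */
	envp[0] = assignment;		/* NAME=value, as in the list above */
	envp[1] = NULL;			/* the vector is NULL-terminated */

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		execle("/bin/sh", "sh", "-c", "[ \"$GREETING\" = hello ]",
		       (char *)0, envp);
		_exit(127);		/* reached only if execle failed */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Inside the new program the same string could be read back with getenv("GREETING").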

- what if argument list (length of) not known at compile time?
- e.g. need to move a series of grades files into backup directory
	% mv grades.jan grades.feb grades.march backup/
- could do it like this:
	for (i = 1; i < argc - 1; i++)
		execl ("/bin/mv", "mv", argv[i], argv[argc - 1], (char *)0);
- why not? [ exec won't return! need to fork-exec ]
- even with a fork-exec per file, obviously very inefficient (one process per file)
- also won't work if we need to specify extra parameters (e.g. recursive flag)

- solution: 3 more exec flavours:
	int execv (const char *path, char *argv[]);
	int execvp (const char *filename, char *argv[]);
	int execve (const char *path, char *argv[], char *envp[]);
-> Note: limits on arg list size (ARG_MAX; POSIX guarantees at least 4096 bytes
   for arguments plus environment)
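
A sketch of building the argument vector at run time and handing it to execv. To stay harmless it does not call mv: it runs sh -c 'exit $#', so the exit status reports how many items the new program received. The same vector-building would apply to a variable number of files for mv. The name count_args_via_execv is invented, and /bin/sh is assumed to exist.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Passes `n` items (n known only at run time) to a new program via execv.
 * Returns the number of items the program saw, or -1 on error. */
int count_args_via_execv(int n, char *const items[])
{
	char **vec;
	int i, status;
	pid_t pid;

	if (n < 0)
		return -1;
	vec = malloc(((size_t)n + 5) * sizeof *vec);	/* 4 fixed + n + NULL */
	if (vec == NULL)
		return -1;
	vec[0] = "sh";
	vec[1] = "-c";
	vec[2] = "exit $#";		/* exit status = number of arguments */
	vec[3] = "sh";			/* becomes $0 of the command string */
	for (i = 0; i < n; i++)
		vec[4 + i] = items[i];
	vec[4 + n] = NULL;		/* execv needs a NULL-terminated vector */

	pid = fork();
	if (pid == -1) {
		free(vec);
		return -1;
	}
	if (pid == 0) {
		execv("/bin/sh", vec);
		_exit(127);		/* reached only if execv failed */
	}
	free(vec);
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```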