Intro to Processes

Components:
	- object program instructions (T) (program text in UNIX)
	- program data (D) (static variables)
	- stack (S) (dynamic variables, function params, etc.)
	- execution context of program: 
		- CPU registers
		- memory usage (valid addresses) 
			-- what happens if process tries to go beyond?
		- resource usage (eg. files, devices)
		- priority, status (blocked, running, etc.)

- OS maintains descriptor (process control block) for each process
- descriptor created when process created
- descriptor destroyed when process terminated (sort of)
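
A descriptor along these lines could be sketched as a C struct. Every name, field and size below is hypothetical, chosen only to mirror the component list above; a real kernel's PCB holds far more state. The lo/hi pair is one assumed way to represent "valid addresses".

```c
#include <stdint.h>

/* Hypothetical process states ("priority, status" in the list above). */
enum proc_state { PROC_READY, PROC_RUNNING, PROC_BLOCKED, PROC_TERMINATED };

#define MAX_OPEN_FILES 16   /* illustrative limit, not from the notes */
#define NUM_SAVED_REGS 8    /* illustrative register count */

/* Hypothetical process control block (descriptor). */
struct pcb {
    int             pid;
    enum proc_state state;
    int             priority;
    uintptr_t       regs[NUM_SAVED_REGS];       /* saved CPU registers */
    uintptr_t       pc;                         /* saved program counter */
    uintptr_t       mem_lo, mem_hi;             /* valid range [mem_lo, mem_hi) */
    int             open_files[MAX_OPEN_FILES]; /* resource usage */
};

/* Returns 1 if addr lies inside the process's valid range, else 0.
 * An OS would typically fault the process on an out-of-range access. */
int pcb_addr_valid(const struct pcb *p, uintptr_t addr)
{
    return addr >= p->mem_lo && addr < p->mem_hi;
}
```

The range check is the sort of test that answers the "what happens if process tries to go beyond?" question: the access is rejected before it reaches memory.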

    |---------------------------------|
    v  proc B         v-----------|   | process descriptor table (PCBs)
----------------------------------|---|---------------
 | S | D | T |     |S|D|T|     | | | | | |       | |      MAIN MEMORY
--------------------------------------^-----------|---
 ^        ^  ^     proc A             |-----------| current process pointer
 |        |  |
----------------
|H        P  L |   Process registers
|i        C  o | 
----------------


Process Creation: Theory

- primitive operations postulated by Conway: fork, join, quit
- forked process shares same address space (unlike real systems, eg. UNIX)
- quit terminates a process
- join(arg) synchronizes processes:
	if (--arg != 0) quit
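
The join test above only works if the decrement and the comparison happen as one indivisible step. A minimal sketch using C11 atomics follows; the function name is hypothetical, and returning a flag instead of quitting is an assumption made so the caller decides how to terminate.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Conway-style join on a shared counter (hypothetical helper).
 * Decrements *count atomically. Returns true if the caller was the last
 * process to arrive and may continue; false means the caller should quit. */
bool join_should_continue(atomic_int *count)
{
    /* atomic_fetch_sub returns the old value, so old - 1 is the new value;
     * this mirrors "if (--arg != 0) quit". */
    return atomic_fetch_sub(count, 1) - 1 == 0;
}
```

With the counter initialised to 2, the first caller is told to quit and the second continues, whichever order they arrive in.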

Classical Interprocess Communication 

     Parent process:
                              Child process:
           A;
           fork(L);               L: C
           B;                        join(V);
           await-join(V);
           D;

- fork gives control to L in child and next statement in parent
- in UNIX:

     A;
     if (fork() == 0) {
        B;               /* child */
        exit(0);         /* terminate child, signal parent */
     } else {
        C;               /* parent */
        wait(NULL);      /* await signal from child */
     }
     D;
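
The UNIX skeleton above can be turned into a small compilable function. This is a sketch, not the only way to write it: the function name is hypothetical, and the child calls _exit() rather than exit() (an assumption made here so that stdio buffers inherited from the parent are not flushed twice).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that terminates with child_status; the parent waits for it.
 * Returns the exit status the parent observes (0..255), or -1 on error. */
int fork_and_wait(int child_status)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;                  /* fork failed, no child exists */
    if (pid == 0) {
        /* B: the child's work would go here */
        _exit(child_status);        /* terminate child, signal parent */
    }
    /* C: the parent's work would go here */
    if (wait(&status) == -1)        /* await signal from child */
        return -1;
    if (!WIFEXITED(status))
        return -1;                  /* child was killed by a signal */
    return WEXITSTATUS(status);     /* D continues in the parent only */
}
```

Only the low 8 bits of the child's status reach the parent, so a child that exits with 259 is seen as 3.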

- synchronization example: 
	process A() {			process B() {
	   while (1) {			   while (1) {
		compute A1;			read x;
		write x;			compute B1;
		compute A2;			write y;
		read y;				compute B2;
	   }				   }
	}				}
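
One possible way to enforce the ordering in this example (B must not read x before A writes it, and A must not read y before B writes it) is to carry x and y over two pipes, since a read on an empty pipe blocks. The notes do not say which mechanism is intended, so this is only a sketch; the function names, the bounded round count and the placeholder computation y = x + 1 are all assumptions.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Process B: read x, compute B1, write y, compute B2, until A closes x. */
static void process_b(int x_in, int y_out)
{
    int x, y;

    while (read(x_in, &x, sizeof x) == (ssize_t)sizeof x) {    /* read x (blocks) */
        y = x + 1;                                             /* compute B1 (placeholder) */
        if (write(y_out, &y, sizeof y) != (ssize_t)sizeof y)   /* write y */
            _exit(1);
        /* compute B2 would go here */
    }
    _exit(0);                                                  /* EOF on x: A is done */
}

/* Runs `rounds` iterations with the caller acting as process A.
 * Returns the sum of the y values A read, or -1 on error. */
long run_exchange(int rounds)
{
    int x_pipe[2], y_pipe[2];
    long sum = 0;
    pid_t pid;

    if (pipe(x_pipe) == -1)
        return -1;
    if (pipe(y_pipe) == -1) {
        close(x_pipe[0]);
        close(x_pipe[1]);
        return -1;
    }

    pid = fork();
    if (pid == 0) {                        /* child becomes process B */
        close(x_pipe[1]);
        close(y_pipe[0]);
        process_b(x_pipe[0], y_pipe[1]);   /* never returns */
    }

    close(x_pipe[0]);                      /* A keeps only its own pipe ends */
    close(y_pipe[1]);
    for (int i = 0; pid != -1 && i < rounds; i++) {
        int x = i, y;                                              /* compute A1 */
        if (write(x_pipe[1], &x, sizeof x) != (ssize_t)sizeof x) { /* write x */
            sum = -1;
            break;
        }
        /* compute A2 would go here, overlapping with B's work */
        if (read(y_pipe[0], &y, sizeof y) != (ssize_t)sizeof y) {  /* read y (blocks) */
            sum = -1;
            break;
        }
        sum += y;
    }
    close(x_pipe[1]);                      /* EOF tells B to stop */
    close(y_pipe[0]);
    if (pid == -1)
        return -1;                         /* fork failed */
    waitpid(pid, NULL, 0);
    return sum;
}
```

Calling run_exchange(3) sends x = 0, 1, 2 and reads back y = 1, 2, 3. "compute A2" and "compute B1" may run concurrently, which is the point of splitting the work across two processes.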

UNIX process-related system calls

- use ps command (ps -aux) to see list of processes in system

Process Creation in UNIX

fork(2):
     #include <unistd.h>
     pid_t fork(void);

- creates new process (child)
- child image is an identical copy of parent
- parent and child do not share address space, have different PCB and PID
- files open in parent remain open in child
- in parent, fork returns with child_pid value
- in child, fork returns with 0
- on failure, fork returns -1 in parent and no child is created

     #include <errno.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <unistd.h>

     pid_t childpid;

     childpid = fork();
     switch (childpid) {
     case -1:
        fprintf (stderr, "error: %s\n", sys_errlist[errno]);
        exit(1);
        break;
     case 0:
        /* child does its thing */
        break;
     default:
        /* parent does its thing */
        break;
     }

- more portable (to non-UNIX platforms) error reporting than sys_errlist
  (which is obsolete and missing from recent C libraries):
	perror(3):   void perror(const char *s);
- child can determine its pid and parent's pid using
	getpid(2):   pid_t getpid(void);
	getppid(2):  pid_t getppid(void);
- parent must often coordinate its activity with children (e.g. exchanging messages)
- simplest coordination is to synchronize with children's termination:

exit(3):
     void exit(int status);
     /* returns lower 8 bits of status to waiting parent */
     /* convention is to exit on 0 if correct termination */

wait(2):
     pid_t wait (int *status);
     /* allows parent to sleep until one of children exits */
     /* returns pid of that child; its exit status is stored in *status */

- two additional functions of interest:

atexit(3):
     int atexit(void (*func)(void));
     /* registers func to be called on normal termination */

waitpid(2):
     #include <sys/types.h>
     #include <sys/wait.h>
     pid_t waitpid (pid_t pid, int *status, int options);
     /* POSIX extension: allows parent to specify child */
     /* pid == -1: waits for any child */
     /* pid > 0: waits for specified child */
     /* options: WNOHANG (only relevant val) = don't block */

- whole point of forking a child is generally to execute a different task
- therefore, need to replace the original process's image with a new one
- stack is also replaced so no return is possible (except if OS fails to exec)

Executing programs inside a process in UNIX

exec(2): execl, execv, execle, execve, execlp, execvp

- new program executed inside process that calls exec
- same PID, open file descriptors, etc.
- simple form:
     int execl (const char *path, const char *arg0, ... [(char *)0]);
- example: want to rename grades file as a backup (% mv grades grades.bak)
- use child to do the dirty work:

     if (fork() == 0) {
        execl ("/bin/mv", "mv", "grades", "grades.bak", (char *)0);
        perror ("execl");   /* if we're here, execl failed */
        exit(1);
     }

- simpler way to do this is to use the path-aware version:
     int execlp (const char *filename, const char *arg0, ... [(char *)0]);
  e.g.
     execlp ("mv", "mv", "grades", "grades.bak", (char *)0);
- the path-aware form offers advantage
	- looks for program in all directories specified by PATH variable
	- if found but not a binary executable (based on magic-number)
	  assumes it's shell script so execs /bin/sh and feeds it arg list

Aside -- Shell scripts:

- default interpreter is Bourne shell
- can specify another interpreter if first line of file contains:
     #!interpreter-pathname
- interpreter is executed with argv[]
- examples:

     #!/bin/sh
     echo foo

     #!/bin/perl
     print "Hello\n";

- can also use environment-aware form:
     int execle (const char *path, const char *arg0, ... [(char *)0, char *envp[]]);
     extern char **environ;
     char *getenv(const char *name);
- environment strings look like:
     HOME=/users/jer
     PATH=/bin:/usr/bin
     HOSTNAME=vivaldi
     ...
- what if argument list (length of) not known at compile time?
- e.g. need to move a series of grades files into backup directory
     % mv grades.jan grades.feb grades.march backup/
- could do it like this:
     for (i = 1; i < argc - 1; i++)
        execl ("/bin/mv", "mv", argv[i], argv[argc - 1], (char *)0);
- why not? [ exec won't return! need to fork-exec ]
- obviously very inefficient
- also won't work if we need to specify extra parameters (e.g. recursive flag)
- solution: 3 more exec flavours:
     int execv  (const char *path, char *const argv[]);
     int execvp (const char *filename, char *const argv[]);
     int execve (const char *path, char *const argv[], char *const envp[]);
  -> Note: limits on arg list size (ARG_MAX; POSIX guarantees at least 4096 bytes)
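
The pieces above (fork, a vector-style exec, waitpid) combine into the usual fork-exec pattern for an argument list built at run time. The sketch below is one way to write it; run_command and build_mv_argv are hypothetical helper names, and 127 as the "exec failed" status is a shell convention assumed here, not something the notes specify.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Builds {"mv", files[0..n-1], dest, NULL}. The array is malloc'ed and
 * must be freed by the caller; the strings themselves are not copied. */
char **build_mv_argv(int n, char *const files[], char *dest)
{
    char **av = malloc((n + 3) * sizeof *av);

    if (av == NULL)
        return NULL;
    av[0] = "mv";
    for (int i = 0; i < n; i++)
        av[i + 1] = files[i];
    av[n + 1] = dest;
    av[n + 2] = NULL;               /* exec requires a NULL terminator */
    return av;
}

/* Runs argv[0] (searched for along PATH) in a child process and waits.
 * Returns the child's exit status, or -1 if fork/waitpid failed or the
 * child did not exit normally. */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);      /* returns only if the exec failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the grades example, build_mv_argv(3, files, "backup/") followed by run_command(av) moves all three files with a single fork-exec, and extra flags can be added to the vector without changing run_command.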