Intro to Processes
Components:
- object program instructions (T) (program text in UNIX)
- program data (D) (static variables)
- stack (S) (dynamic variables, function params, etc.)
- execution context of program:
- CPU registers
- memory usage (valid addresses)
-- what happens if process tries to go beyond?
- resource usage (e.g. files, devices)
- priority, status (blocked, running, etc.)
- OS maintains descriptor (process control block) for each process
- descriptor created when process created
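- descriptor destroyed when process terminates (sort of: in UNIX a terminated child's descriptor lingers as a "zombie" until the parent collects its exit status)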
|---------------------------------|
v proc B v-----------| | process descriptor table (PCBs)
----------------------------------|---|---------------
| S | D | T | |S|D|T| | | | | | | | | MAIN MEMORY
--------------------------------------^-----------|---
^ ^ ^ proc A |-----------| current process pointer
| | |
----------------
|H P L | Process registers
|i C o |
----------------
Process Creation: Theory
- primitive operations postulated by Conway: fork, join, quit
- forked process shares same address space (unlike real systems, e.g. UNIX)
- quit terminates a process
- join(arg) synchronizes processes:
if (--arg != 0) quit
Classical Interprocess Communication
Parent process:        Child process:
A;
fork(L);               L: C;
B;                     join(V);
await-join(V);
D;
- fork gives control to L in child and next statement in parent
- in UNIX:
A;
if (fork() == 0) {
    B;           /* child */
    exit(0);     /* terminate child, signal parent */
} else {
    C;           /* parent */
    wait(NULL);  /* await signal from child */
}
D;
- synchronization example:
process A() {          process B() {
    while (1) {            while (1) {
        compute A1;            read x;
        write x;               compute B1;
        compute A2;            write y;
        read y;                compute B2;
    }                      }
}                      }
UNIX process-related system calls
use ps command (ps -aux) to see list of processes in system
Process Creation in UNIX:
fork(2): #include <unistd.h>
pid_t fork(void);
- creates new process (child)
- child image is an identical copy of parent
- parent and child do not share address space, have different PCB and PID
- files open in parent remain open in child
- in parent, fork returns with child_pid value
- in child, fork returns with 0
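#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
pid_t childpid;
childpid = fork();
switch (childpid) {
case -1:
    /* fork failed: no child was created */
    fprintf (stderr, "error: %s\n", strerror(errno));
    exit(1);
    break;
case 0:
    /* child does its thing */
    break;
default:
    /* parent does its thing; childpid holds the child's PID */
    break;
}
- strerror(errno) is used above in place of the older sys_errlist[errno] idiom, which is obsolete
- more portable (to non-UNIX platforms) error reporting than sys_errlist:
perror(3): void perror(const char *s);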
- child can determine its pid and parent's pid using
getpid(2): pid_t getpid(void);
getppid(2): pid_t getppid(void);
- parent must often coordinate its activity with children
(e.g. exchanging messages)
- simplest coordination is to synchronize with children's termination:
exit(3): void exit(int status);
/* makes lower 8 bits of status available to waiting parent */
/* convention is to exit with 0 on successful termination */
wait(2): pid_t wait (int *status);
/* allows parent to sleep until one of its children exits */
/* returns pid of that child; its exit status is stored in *status */
- two additional functions of interest:
atexit(3): int atexit(void (*func)(void));
/* called on normal termination */
waitpid(2): #include <sys/types.h>
#include <sys/wait.h>
pid_t waitpid (pid_t pid, int *status, int options);
/* POSIX extension: allows parent to specify child */
/* pid == -1: waits for any child */
/* pid > 0: waits for specified child */
/* options: WNOHANG (only relevant val) = don't block; returns 0 if no child has exited yet */
- whole point of forking a child is generally to execute a different task
- therefore, need to replace the original process's image with a new one
- stack is also replaced so no return is possible (except if OS fails to exec)
Executing programs inside a process in UNIX:
exec(2): execl, execv, execle, execve, execlp, execvp
- new program executed inside process that calls exec
- same PID, open file descriptors, etc.
- simple form:
int execl (const char *path, const char *arg0, ... [(char *)0]);
- example: want to rename grades file as a backup (% mv grades grades.bak)
- use child to do the dirty work:
if (fork() == 0) {
execl ("/bin/mv", "mv", "grades", "grades.bak", (char *)0);
perror ("execl"); /* if we're here, execl failed */
exit(1);
}
- simpler way to do this is to use the path-aware version:
int execlp (const char *filename, const char *arg0, ... [(char *)0]);
e.g. execlp ("mv", "mv", "grades", "grades.bak", (char *)0);
- the path-aware form offers two advantages:
- looks for program in all directories specified by PATH variable
- if found but not a binary executable (based on magic number),
assumes it's a shell script, so execs /bin/sh and feeds it the file and arg list
Aside -- Shell scripts:
- default interpreter is Bourne shell
- can specify another interpreter if first line of file contains:
#!<pathname of interpreter>
- interpreter is executed with the script's path and the original arguments in argv[]
- examples:
#!/bin/sh
echo foo

#!/bin/perl
print "Hello\n";
- can also use environment-aware form:
int execle (const char *path, const char *arg0, ...
[(char *)0, char *envp[]]);
extern char **environ;
char *getenv(const char *name);
- environment strings look like:
HOME=/users/jer
PATH=/bin:/usr/bin
HOSTNAME=vivaldi
...
- what if argument list (length of) not known at compile time?
- e.g. need to move a series of grades files into backup directory
% mv grades.jan grades.feb grades.march backup/
- could do it like this:
for (i = 1; i < argc - 1; i++)
    execl ("/bin/mv", "mv", argv[i], argv[argc - 1], (char *)0);
- why not? [ exec won't return! need to fork-exec ]
- obviously very inefficient
- also won't work if we need to specify extra parameters (e.g. recursive flag)
- solution: 3 more exec flavours:
int execv (const char *path, char *argv[]);
int execvp (const char *filename, char *argv[]);
int execve (const char *path, char *argv[], char *envp[]);
-> Note: limits on arg list size (POSIX guarantees at least 4096 bytes for arguments plus environment; actual limit available via sysconf(_SC_ARG_MAX))