Whenever you fork, you spawn off a set of new concurrent child processes from a parent process. The difference between the join, join_any, and join_none statements is in what the parent process does after the children are spawned off.
join - parent process blocks (waits) until all child processes complete and return, then it continues to the next statement after the join
join_any - parent process blocks (waits) until any child completes (i.e. the first to complete), then it continues to the next statement after the join_any
join_none - parent process does not block. Execution continues to the next statement after the join_none. Child processes are scheduled to start, but do not start until the parent encounters a blocking statement (#, @, wait) or terminates.
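The three behaviours above can be compared side by side in a small simulation. This is only a sketch: the module name fork_join_demo, the task child, and the delay values are made up for illustration.

```systemverilog
module fork_join_demo;
  // Hypothetical child task: waits `delay` time units, then reports.
  task automatic child(input string name, input int delay);
    #delay;
    $display("[%0t] %s done", $time, name);
  endtask

  initial begin
    // join: the parent resumes only after both A and B have completed.
    fork
      child("A", 10);
      child("B", 30);
    join
    $display("[%0t] after join", $time);

    // join_any: the parent resumes as soon as C completes; D keeps running.
    fork
      child("C", 10);
      child("D", 30);
    join_any
    $display("[%0t] after join_any", $time);

    // join_none: the parent does not wait; E has not started yet when
    // the next $display executes.
    fork
      child("E", 10);
    join_none
    $display("[%0t] after join_none", $time);

    // wait fork blocks until every remaining child (D and E) has completed.
    wait fork;
    $display("[%0t] all children finished", $time);
  end
endmodule
```

The timestamps printed next to each message show where the parent was blocked and where it was not.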
join_none is useful when you want to start a set of ongoing processes, such as monitor processes that independently watch for interesting activity, or when you want to start up several independent stimulus generation processes. In both cases, you don't really care when, or even if, the child processes finish.
join_any is useful when you want to start a child process with a "timeout", so that execution in the parent continues after either the child has finished or the timeout has expired, whichever comes first:
    fork
        start_child_task();
        #10000;
    join_any
Here, two processes are spawned off, running concurrently. One process executes a task, which takes some unknown amount of time to complete. The second process just delays for 10000 time units, then completes. The parent process will continue execution after whichever child completes first. If desired, you can add a "disable fork" statement after the join_any so that the parent will kill the remaining process. Note that disable fork kills all active descendants of the calling process, not just the children of the most recent fork, so it will also terminate any processes the parent started earlier with join_none.
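One common way to keep disable fork from killing unrelated processes is to wrap the timeout in an outer fork begin ... end join, so that the disable only sees the two timeout branches. The sketch below assumes a made-up body for start_child_task (a fixed 500-unit delay) and a hypothetical timed_out flag; neither comes from the original snippet.

```systemverilog
module timeout_demo;
  bit timed_out;  // hypothetical flag recording which branch won

  // Hypothetical stand-in for the real task; the 500-unit delay is arbitrary.
  task automatic start_child_task();
    #500;
  endtask

  initial begin
    // Outer fork/join creates a guard process, so that disable fork
    // only affects the two branches spawned inside it.
    fork begin
      fork
        start_child_task();
        begin
          #10000;
          timed_out = 1;
        end
      join_any
      disable fork;  // kill whichever branch is still running
    end join

    if (timed_out)
      $display("[%0t] child timed out", $time);
    else
      $display("[%0t] child finished in time", $time);
  end
endmodule
```

With the 500-unit stand-in, the task branch completes first and the delay branch is killed before it can set timed_out.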
fork/join_any is useful for watchdog tasks. For example, task #1 is responsible for tracking the total number of clocks a test may execute, while task #2 checks for abnormally long idle periods on an interface of interest. If a test hangs with no bus activity, then task #2 completes and finishes the test. On the other hand, if the test is running fine, but is just taking way too many clocks to finish because of a DV oversight, then task #1 will complete and terminate the test. If neither task completes, the watchdog process stays blocked at the join_any and never ends the test.
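A minimal sketch of such a watchdog might look like the following. The signal names (clk, bus_valid), the limits (MAX_CLOCKS, MAX_IDLE), and the task names are all assumptions made for illustration; a real testbench would hook these up to its own clock and interface.

```systemverilog
module watchdog_demo;
  // All names and limits below are hypothetical.
  localparam int MAX_CLOCKS = 1000;  // total clock budget for the test
  localparam int MAX_IDLE   = 50;    // longest tolerated idle period, in clocks

  bit clk;
  bit bus_valid;  // stands in for "activity on the interface of interest"

  always #5 clk = ~clk;

  // Task #1: completes once the test has run for MAX_CLOCKS clocks.
  task automatic total_clock_watchdog();
    repeat (MAX_CLOCKS) @(posedge clk);
    $display("[%0t] watchdog: clock budget exhausted", $time);
  endtask

  // Task #2: completes once the bus has been idle for MAX_IDLE
  // consecutive clocks; any activity resets the count.
  task automatic idle_watchdog();
    int idle_count = 0;
    while (idle_count < MAX_IDLE) begin
      @(posedge clk);
      idle_count = bus_valid ? 0 : idle_count + 1;
    end
    $display("[%0t] watchdog: bus idle too long", $time);
  endtask

  // Watchdog process: blocked until either task completes, then ends the test.
  initial begin
    fork
      total_clock_watchdog();
      idle_watchdog();
    join_any
    disable fork;
    $finish;
  end
endmodule
```

Because nothing drives bus_valid in this sketch, the idle watchdog is the branch that fires; in a real test, bus traffic would keep resetting idle_count.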
fork/join_none is useful if multiple processes are providing stimulus. The testbench can start multiple tasks for each interface, and not worry about when the tasks are complete.
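As a rough sketch of that pattern, the loop below starts one stimulus task per interface with fork/join_none. The task drive_interface, the count NUM_IFS, and the delays are invented for the example. One detail worth noting: because the children do not start until the parent blocks, each iteration copies the loop index into an automatic variable; otherwise every child would see the final value of i.

```systemverilog
module stimulus_demo;
  localparam int NUM_IFS = 3;  // hypothetical number of interfaces

  // Hypothetical stimulus task: sends three items at an interface-specific rate.
  task automatic drive_interface(input int id);
    repeat (3) begin
      #(10 * (id + 1));
      $display("[%0t] interface %0d: sent item", $time, id);
    end
  endtask

  initial begin
    for (int i = 0; i < NUM_IFS; i++) begin
      automatic int id = i;  // per-iteration copy captured by the child
      fork
        drive_interface(id);
      join_none
    end

    // Optional: only needed here so a final message can be printed
    // once every stimulus process has completed.
    wait fork;
    $display("[%0t] all stimulus done", $time);
  end
endmodule
```

If the testbench truly does not care when the stimulus finishes, the wait fork can simply be dropped.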