PHP on the Command Line - Part 2
Working with the Environment
No PHP script is an island, particularly when it's used to execute other programs. A Unix shell provides its own "memory" in which you can store environment variables. A number of variables are predefined and some, such as the PATH variable, are essential if your PHP script is to execute external programs "naively", that is, by name alone rather than by full path.
The excellent LINUX: Rute User's Tutorial and Exposition provides a summary of common environment variables.
When you execute a PHP script from the command line, it inherits the environment variables defined in your shell. That means you can set an environment variable using the export command like so:
$ export SOMEVAR='Hello World!'
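Note we use single quotes -- not double quotes! Inside double quotes, an interactive shell such as bash would treat the "!" as the start of a history expansion.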
Now, let's execute the following PHP script:
#!/usr/local/bin/php
<?php
fwrite(STDOUT,"The value of SOMEVAR is ".getenv('SOMEVAR')."\n");
exit(0);
?>
Filename: env1.php
You will see the value you assigned to SOMEVAR. The getenv() function allows you to read environment variables.
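One detail worth knowing is that getenv() returns false when the variable is not set at all. A minimal sketch of a fallback, assuming a hypothetical helper name env_or_default() that is not part of PHP:

```php
<?php
// Hypothetical helper: read an environment variable, falling back to a
// default when it is not set (getenv() returns false in that case).
function env_or_default(string $name, string $default): string
{
    $value = getenv($name);
    return $value === false ? $default : $value;
}

fwrite(STDOUT, "The value of SOMEVAR is " . env_or_default('SOMEVAR', '(not set)') . "\n");
```

Run without exporting SOMEVAR first, the script prints the placeholder instead of an empty string.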
You can also create or modify environment variables from a PHP script using the putenv() function, but with one important caveat: you can't modify the parent process's environment, only the local copy to which your script has access.
Continuing from the last example, let's execute the following PHP script:
#!/usr/local/bin/php
<?php
fwrite(STDOUT,"The value of SOMEVAR is ".getenv('SOMEVAR')."\n");
// No shell parses this string, so quotes would become part of the value
putenv("SOMEVAR=Goodbye World!");
fwrite(STDOUT,"The value of SOMEVAR is ".getenv('SOMEVAR')."\n");
exit(0);
?>
Filename: env2.php
On executing this code, we'll see the modified value of SOMEVAR on the second call to fwrite(). But if I type the following from my shell once the script finishes execution, I see the original value, "Hello World!":
$ echo $SOMEVAR
At first glance you might think that makes putenv() fairly useless. Where it becomes important is when your PHP script needs to set up the environment for another program that it will be executing. Look at this example:
#!/usr/local/bin/php
<?php
// No shell parses this string, so quotes would become part of the value
putenv("SOMEVAR=Goodbye World!");
fwrite(STDOUT,"Executing env1.php\n");
fwrite(STDOUT,shell_exec('./env1.php'));
exit(0);
?>
Filename: env3.php
Because the script env1.php inherits the environment from its parent process, env3.php, it sees the modified value of SOMEVAR and displays "Goodbye World!"
In other words, environment variables can be used to communicate information between programs. You might define an environment variable INSTALL_DIR, which points to a directory into which multiple scripts should install something. Some main script, perhaps install.php, sets this variable, then executes other scripts that perform further installation tasks, such as extracting source code archives into the installation directory, or creating API documentation with phpDocumentor.
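The INSTALL_DIR idea can be sketched as follows. The directory '/tmp/myapp' and the one-line child program are assumptions made purely for illustration; a real install.php would call its own helper scripts instead.

```php
<?php
// Hypothetical install.php: publish INSTALL_DIR, then run a child that reads it.
putenv('INSTALL_DIR=/tmp/myapp'); // illustrative path

// The child is a one-line PHP program that prints the variable it inherited.
$child = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg('echo getenv("INSTALL_DIR");');
$seen = shell_exec($child);

fwrite(STDOUT, "Child sees INSTALL_DIR as $seen\n");
```

The child never receives the path as an argument; it finds it in the environment it inherited from the parent.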
It's also important to be aware of environment variables when executing external programs from PHP scripts running under Apache. Typically, the scripts will run with the environment of a special Unix user ("nobody", "wwwrun" or similar), with very few filesystem privileges and perhaps a non-standard user environment.
This may mean that, for example, when you attempt to execute under Apache a program that your own shell would normally find via its PATH, nothing visible happens. It may be that you just need to specify the full path to the program or update the PATH variable temporarily using putenv(). It could also be a permissions issue, which is a different story.
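A minimal sketch of the temporary PATH update, assuming the program you need lives in '/usr/local/bin' (an example directory only). The change lasts for the current script and any programs it starts:

```php
<?php
// Append a directory to PATH for this process and its children.
$extraDir = '/usr/local/bin'; // example directory, adjust to suit
$path = getenv('PATH');

if ($path === false || $path === '') {
    putenv('PATH=' . $extraDir);
} else {
    putenv('PATH=' . $path . PATH_SEPARATOR . $extraDir);
}

fwrite(STDOUT, "PATH is now " . getenv('PATH') . "\n");
```

Giving the full path to the program in the call that executes it avoids the question altogether, at the cost of hard-coding a location.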
Note that if you control the server and need to execute external programs from PHP scripts running under Apache, it may be worth investigating sudo, which provides a mechanism through which one user can execute a single command with the same environment and permissions as another user (obviously, you need to be extremely cautious when doing this).
Process Control
Earlier, we saw the popen() function, which allows us either to read or write to an external program. Sometimes, you need to do both, which is where the proc_open() function comes in handy.
Here's a simple program that reads input line by line from STDIN and converts the first letters of each word to upper-case, then writes the line back to STDOUT. Because it's a somewhat temperamental program, it doesn't like to get words in all upper-case. If it finds an upper-case word, it complains to STDERR and removes that word:
#!/usr/local/bin/php
<?php
// Stop on the "exit" command, or when STDIN is closed (fgets() returns false)
while ( ($line = fgets(STDIN)) !== false && ($line = trim($line)) != 'exit' ) {
    $words = explode(' ',$line);
    foreach ( $words as $index => $word ) {
        if ( strcmp($word,strtoupper($word)) != 0 ) {
            $words[$index] = ucfirst($word);
        } else {
            fwrite(STDERR, "Bleugh! $word is all upper case!\n");
            unset($words[$index]);
        }
    }
    $line = implode(' ',$words);
    fwrite(STDOUT, $line."\n");
}
exit(0);
?>
Filename: ucfirst.php
Being able to talk to this program from another PHP script is obviously trickier. We need to connect to STDIN, STDOUT and STDERR at the same time. Enter: proc_open…
#!/usr/local/bin/php
<?php
// Describes how proc_open should open the external program
$Spec = array (
    0 => array('pipe','r'), /* STDIN */
    1 => array('pipe','w'), /* STDOUT */
    2 => array('file','./badwords.log','a'), /* STDERR */
);
First, I need to define a "descriptor spec", which is an array with a special structure. It tells proc_open() how to connect to the STDIN, STDOUT and STDERR of the external program. It should be an array of arrays, the parent array containing three elements that correspond to STDIN, STDOUT and STDERR respectively. Each of the child arrays specifies how each of these should be used, identifying a "resource type": either a "pipe" for shell IO, or a "file", and further parameters that describe how the pipe or file should be used. This may seem a little "clunky" at first, but you'll get used to it.
Here, I've defined a descriptor spec that says, "make STDIN available to the external program for reading (the spec is seen from the perspective of the external program, even though this script will be writing to the pipe), make STDOUT available, so the external program can write output to it, then point STDERR to the file 'badwords.log' and append any error messages to it."
Let's move on to the rest of the program, where the purpose of the "descriptor spec" should become clearer:
// Handles to the external program's STDIN, STDOUT and STDERR are placed here
$handles = array();
// Open the process using the descriptor spec
$process = proc_open('./ucfirst.php', $Spec, $handles);
// Some lines to input to ucfirst.php
$lines = array (
    'hello world!',
    'a bad EXAMPLE of upper case.',
    'goodbye world!',
);
foreach ( $lines as $line ) {
    // Write the line to the STDIN of ucfirst.php
    fwrite($handles[0],$line."\n");
    $response = fgets($handles[1]);
    // Display the response to the user
    fwrite(STDOUT, $response);
}
// Issue the exit command to the ucfirst.php program
fwrite($handles[0],"exit\n");
// Clean up
fclose($handles[0]);
fclose($handles[1]);
proc_close($process);
exit(0);
?>
Filename: proc_open.php
There are a few things to notice here. Executing proc_open() returns a resource that represents the process, but to interact with the process you need to read from or write to handles stored in the $handles array, the elements of the array corresponding to those in the descriptor spec array.
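To make the correspondence between the descriptor spec and the handles concrete, here is a variation in which STDERR is a pipe rather than a file, so a third handle appears. The one-line child program is a stand-in invented for this sketch, not part of the example above:

```php
<?php
$spec = array(
    0 => array('pipe', 'r'), // child's STDIN:  we write to $handles[0]
    1 => array('pipe', 'w'), // child's STDOUT: we read from $handles[1]
    2 => array('pipe', 'w'), // child's STDERR: we read from $handles[2]
);

// A stand-in child that writes one line to each output stream.
$code = 'fwrite(STDOUT, "to stdout\n"); fwrite(STDERR, "to stderr\n");';
$command = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($code);

$handles = array();
$process = proc_open($command, $spec, $handles);

fclose($handles[0]);                     // nothing to send to the child
$out = stream_get_contents($handles[1]); // read until the child closes STDOUT
$err = stream_get_contents($handles[2]); // then read whatever went to STDERR
fclose($handles[1]);
fclose($handles[2]);
proc_close($process);

fwrite(STDOUT, "STDOUT gave: $out");
fwrite(STDOUT, "STDERR gave: $err");
```

Reading one pipe to the end before touching the other is fine for small outputs like this. With a chatty child it can stall, which is where stream_select() is worth a look.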
Note that you need to be particularly careful to close handles to pipes, as I did in the clean up section above. A smarter approach might be to register a shutdown function to take care of cleaning up -- see register_shutdown_function().
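A rough sketch of the shutdown-function approach, assuming 'cat' as a stand-in external program (it simply echoes what it is sent). The clean-up is registered once, straight after the process is opened:

```php
<?php
$spec = array(
    0 => array('pipe', 'r'),
    1 => array('pipe', 'w'),
);
$handles = array();
$process = proc_open('cat', $spec, $handles);

// Runs however the script ends: normal completion, exit() or a fatal error.
register_shutdown_function(function () use (&$handles, $process) {
    foreach ($handles as $handle) {
        if (is_resource($handle)) {
            fclose($handle);
        }
    }
    if (is_resource($process)) {
        proc_close($process);
    }
});

fwrite($handles[0], "hello\n");
$reply = fgets($handles[1]);
fwrite(STDOUT, $reply);
```

The is_resource() checks mean it is still safe to close a handle by hand earlier in the script.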
Now, when we execute proc_open.php, the output looks like that shown below:
Hello World!
A Bad Of Upper Case.
Goodbye World!
The badwords.log file contains:
Bleugh! EXAMPLE is all upper case!
With PHP5, proc_open() supports a new descriptor, "pty", which will allow it to control programs that read from and write to locations other than STDIN and STDOUT (this should solve the problem with sending a password to mysqldump, for example).
PHP5 also introduces three more functions. proc_get_status() provides information about what the external program is currently doing (most importantly, providing its process ID). proc_terminate() allows you to send the process a termination signal (more on that below). proc_nice() allows you to change the priority the operating system assigns to a program (how much processor time it gets); normally this is a number between 20 (lowest priority) and -20 (highest priority). Note that proc_nice() is actually not related to proc_open(): it's meant to control the priority of the current PHP script, not the priority of an external program. From the shell, the renice command can be used to the same effect.
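A minimal sketch of proc_get_status() and proc_terminate() working together, assuming 'sleep 30' as a stand-in for a long-running external program. The leading "exec" asks the shell that proc_open() starts to replace itself with sleep, so the reported process ID belongs to sleep itself:

```php
<?php
$pipes = array();
$process = proc_open('exec sleep 30', array(), $pipes);

// Ask what the child is doing; 'pid' and 'running' are two of the keys returned.
$status = proc_get_status($process);
fwrite(STDOUT, "Child process ID: " . $status['pid'] . "\n");
fwrite(STDOUT, "Still running? " . ($status['running'] ? 'yes' : 'no') . "\n");

// Send the default termination signal (SIGTERM), then collect the child.
proc_terminate($process);
proc_close($process);
```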
Signals
The POSIX functions provide various tools to get useful information from the operating system (note the POSIX extension is not available on Windows). Perhaps most useful, from the point of view of working with external programs, is the posix_kill() function, which allows you to send signals to a program while it's executing.
If you've used Unix more than a little, you've probably run into the kill command, perhaps in its most famous incarnation:
$ kill -9 <pid>
The "-9" says send the "SIGKILL" signal (exit immediately) to the process identified by its ID. The Rute User's Tutorial lists some common signals (in fact, it's worth reading the whole chapter on Processes and Environment Variables).
The posix_kill() function works in the same way as the kill command: you identify a process by its process ID and send it a signal. Normally, you should send signal number 15 (SIGTERM), which gives the program you're killing a chance to clean up before it exits. The SIGKILL signal (9) cannot be intercepted by a program; it dies immediately, which may mean it leaves a mess behind.
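A minimal sketch of posix_kill() in use, assuming a stand-in child ('sleep 30') started with proc_open() so that there is a process ID to signal. The number 15 is written out because the SIGTERM constant comes from the pcntl extension, which may not be loaded:

```php
<?php
$pipes = array();
$process = proc_open('exec sleep 30', array(), $pipes);
$status = proc_get_status($process);
$pid = $status['pid'];

// Signal 15 is SIGTERM: a polite request to clean up and exit.
$sent = posix_kill($pid, 15);
fwrite(STDOUT, $sent ? "Sent SIGTERM to $pid\n" : "Could not signal $pid\n");

// Collect the child so it does not linger as a zombie.
proc_close($process);
```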
For more information on signals, it's also worth glancing at PHP's Process Control Extension (disabled by default), which provides functions to allow your scripts to work with signals (among other things).
Warning: do not use the pcntl functions when running PHP under Apache! Weird things will happen if you do.
Here's a script that "catches" the SIGTERM signal:
#!/usr/local/bin/php
<?php
// Required from PHP 4.3.0+: see manual
declare(ticks = 1);
function cleanUp() {
    fwrite(STDOUT,"Performing clean up...\n");
    exit(0);
}
// Map the SIGTERM signal to the cleanup callback function
pcntl_signal(SIGTERM, "cleanUp");
// Illegal - you can't catch the KILL signal
# pcntl_signal(SIGKILL, "cleanUp");
while (1) {
    // Loop forever
}
?>
Filename: pcntl1.php
I execute this from the command line like so:
$ ./pcntl1.php &
This instructs the shell to run the process in the background. Immediately after execution, it tells me the process ID under which my script is running, for example:
[6] 2840
Here, "2840" is the process ID. I now type the following:
$ kill 2840
And I see the below:
Performing clean up...
The process then exits (note the kill command defaults to the desired SIGTERM signal, in case you were wondering).
You may find it useful to catch signals like SIGINT, which corresponds to a user pressing CTRL+C to quit your program; this would give you the opportunity to roll back anything your script has done.
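A rough sketch of catching SIGINT. It uses pcntl_signal_dispatch() (available from PHP 5.3) rather than declare(ticks = 1), and it signals itself to simulate the CTRL+C, so both of those choices are assumptions of this example rather than requirements:

```php
<?php
$interrupted = false;

// The handler only records that the signal arrived; the main loop decides
// when it is safe to stop.
pcntl_signal(SIGINT, function ($signo) use (&$interrupted) {
    $interrupted = true;
});

// Simulate the user pressing CTRL+C by sending SIGINT to this script.
posix_kill(posix_getpid(), SIGINT);

while (!$interrupted) {
    pcntl_signal_dispatch(); // run handlers for any pending signals
    usleep(100000);          // stand-in for one unit of real work
}

fwrite(STDOUT, "Interrupted: rolling back...\n");
```

Setting a flag and leaving the loop at a known point makes it easier to roll back cleanly than exiting from inside the handler.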
More PCNTL Tricks
One further function of note is pcntl_exec(). Unlike any of the program execution functions we've seen so far, the use of pcntl_exec() to execute an external program causes the program to replace the script by which it was executed, instead of running as a child process.
For example, imagine I execute the following script:
#!/usr/local/bin/php
<?php
fwrite(STDOUT,"My process ID is ".posix_getpid()."\n");
pcntl_exec('./longweight.php',array('-l15','-uharryf','-psecret'));
?>
Filename: pcntl2.php
I then execute the code below from a separate shell:
$ ps -ef | grep longweight.php
The process ID of longweight.php is the same as the process ID reported from pcntl2.php. What's more, if I execute the following, I see nothing:
$ ps -ef | grep pcntl2.php
The reason I see nothing is because the script was replaced by longweight.php.
This behaviour can be useful when writing "wrapper" scripts that set up an environment for another script, then quietly disappear.
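A sketch of such a wrapper, assuming it is invoked as ./wrapper.php /path/to/program [args...]. The file name, the wrapper_env() helper and the INSTALL_DIR value are all hypothetical. The third argument to pcntl_exec() becomes the new program's entire environment, so PATH is carried over explicitly:

```php
<?php
// Hypothetical helper: the environment the target program should receive.
function wrapper_env(): array
{
    return array(
        'PATH'        => getenv('PATH') ?: '/usr/bin:/bin',
        'INSTALL_DIR' => '/tmp/myapp', // illustrative value
    );
}

if (isset($argv[1])) {
    $program = $argv[1];
    $args = array_slice($argv, 2);

    // On success this call never returns: the target replaces this script.
    pcntl_exec($program, $args, wrapper_env());

    // Only reached if the exec itself failed.
    fwrite(STDERR, "Could not execute $program\n");
    exit(1);
}
```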
Another area in which the PCNTL functions are useful is if you're writing some kind of "daemon" process with PHP (a server). Believe it or not, there are a number of HTTP servers written entirely in PHP, such as Nanoweb and PEAR::HTTP_Server.
Using the pcntl_fork() function, you can write a script that acts as a server but creates child processes to handle incoming requests, in much the same way as the Apache Web server (1.x). This allows the server to cope with multiple tasks in parallel, reducing delays. Unfortunately, pcntl_fork() doesn't lend itself to a short discussion, so I'll leave it for you to investigate further, should you need it.
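As a starting point for that investigation, here is a bare-bones sketch of the fork-and-wait pattern. The task list is made up; a real server would accept network connections instead, and would need far more error handling:

```php
<?php
$tasks = array('alpha', 'beta', 'gamma'); // stand-ins for incoming requests
$children = array();

foreach ($tasks as $task) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        fwrite(STDERR, "Could not fork\n");
        exit(1);
    }
    if ($pid === 0) {
        // Child: do the work, then exit so it never continues the loop.
        fwrite(STDOUT, "Child " . posix_getpid() . " handling $task\n");
        exit(0);
    }
    // Parent: remember the child and carry on to the next task.
    $children[$pid] = $task;
}

// Reap every child so none is left behind as a zombie.
$finished = 0;
while (count($children) > 0) {
    $pid = pcntl_wait($status);
    if ($pid <= 0) {
        break;
    }
    unset($children[$pid]);
    $finished++;
}

fwrite(STDOUT, "All $finished children finished\n");
```

pcntl_fork() returns twice: 0 in the new child and the child's process ID in the parent, which is what the two branches above distinguish.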
Over to You
You should now have a pretty good idea of how to get your command line PHP scripts to "tune in" to the wealth of tools and utilities available on Unix-based systems. There's a lot to learn if you're new to Unix, and in this short article I've only been able to introduce the topics. Hopefully, though, you have a feeling for how much power is at your disposal.
If you're interested in finding out more, I highly recommend Paul Sheer's Rute User's Tutorial and Exposition, also available from Amazon.