PHP NOTES: Benchmarking heredoc notation

This note describes benchmarks of multiple echo statements against single heredoc statements.


Starting with version 4, PHP provides "heredoc" notation, a way of writing a multi-line string literal that makes it easy to print or echo large amounts of text at once. Heredoc notation can save tons of keystrokes, and it spares you the hassle of escaping double quotation marks. It's simple to use:

<?php

$vars = "variables";

echo <<<EOT
This text will be echoed out,
so will this text, plus you can "use quotation marks"
without escaping them! You can also include $vars
directly in your output.
EOT;

#EOT stands for "End of Text," and is a commonly used heredoc
#delimiter. A line of your text that starts with "EOT" can end the
#heredoc early (before PHP 7.3, only a line containing nothing else
#does): you'll receive errors and not all of your text will be
#printed. You can use any delimiter you like; some programmers
#prefer "QQQ" since it's unlikely to show up in real text.

?>
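To make the "no escaping" point concrete, here is a minimal sketch that builds the same text twice, once as a heredoc and once as an ordinary double-quoted string, and checks that the two are identical. The variable names $heredoc and $quoted are just illustrative.

```php
<?php
$vars = "variables";

// Heredoc: double quotes need no backslashes, variables are interpolated.
$heredoc = <<<EOT
You can "use quotation marks" and include $vars.
EOT;

// The same text as a double-quoted string: every inner quote is escaped.
$quoted = "You can \"use quotation marks\" and include $vars.";

var_dump($heredoc === $quoted); // bool(true)
```

Note that the line break just before the closing delimiter is not part of the heredoc's value, which is why the two strings match exactly.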

But I've often wondered: is heredoc notation faster than using multiple echo statements?

Many people, myself included, write their scripts with readability in mind. For me, that means I try to keep each line of a script within the standard 80-column viewable real estate of a Unix text editor. In my older scripts this resulted in a lot of extra calls to "echo" to prevent long lines from running off into the margin. If I wanted to print out a 120-character blob of text, I'd break it into two smaller echo statements so that the script was still easy to read at the terminal.

Somewhere along the line, it hit me that perhaps this wasn't such a good idea, and that all those extra echo statements might be hindering the performance of my scripts. I decided to test my hunch by doing a couple of simple benchmarks. First, I created bench1.php, which uses heredoc notation to echo a line of text with a few variables. In order to generate a measurable execution time, the line of text is echoed 10,000 times:

<?php
#bench1.php

$string = "Test Data";
$number = 412;

for($i=0; $i<10000; $i++){

echo <<<EOT
The string is $string and the number is $number.\n
EOT;

}
?>

Then I created bench2.php, which did the exact same thing but used two echo statements instead of heredoc notation:

<?php
#bench2.php

$string = "Test Data";
$number = 412;

for($i=0; $i<10000; $i++){

echo "The string is $string ";
echo "and the number is $number.\n";

}
?>
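A benchmark like this is only fair if both scripts really emit the same bytes. The following sketch, which is not part of the original test, builds the output of one loop iteration both ways and compares them; the names $heredoc and $twoEchoes are made up for the check.

```php
<?php
$string = "Test Data";
$number = 412;

// One iteration of bench1.php, captured in a variable instead of echoed.
$heredoc = <<<EOT
The string is $string and the number is $number.\n
EOT;

// One iteration of bench2.php: the two echoed strings joined together.
$twoEchoes = "The string is $string " . "and the number is $number.\n";

var_dump($heredoc === $twoEchoes); // bool(true)
```

The \n escape is interpreted inside a heredoc just as it is inside double quotes, and the line break before the closing EOT is dropped, so each version prints exactly one newline per iteration.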

With the help of the time command, I found that the heredoc notation was indeed noticeably faster than using two echo statements. I only ran the test twice, but was satisfied that the results were consistent:

[shaun@agaliarept test]$ time php bench1.php > outfile

real 0m7.908s
user 0m0.000s
sys 0m7.883s
[shaun@agaliarept test]$ time php bench2.php > outfile

real 0m11.133s
user 0m0.000s
sys 0m11.095s
[shaun@agaliarept test]$ time php bench1.php > outfile

real 0m7.865s
user 0m0.000s
sys 0m7.842s
[shaun@agaliarept test]$ time php bench2.php > outfile

real 0m11.005s
user 0m0.000s
sys 0m10.970s

Both times, the heredoc version of the script took about 8 seconds to finish, while the script with two echo statements ran for about 11 seconds. Convinced that I was onto something, I decided to create a more realistic test. I replaced bench1.php and bench2.php with new scripts which echoed out a short HTML page. As is frequently done in real-world scripting, certain attributes of the HTML, such as the background color, text color, and font face, were stored in variables.

<?php
#The new bench1.php

$bgcolor = "#FFFFFF"; $textcolor = "#000000"; $link = "#0000FF";
$fontface = "Verdana,Arial"; $fontsize = 2;

for($i=0; $i<10000; $i++){

echo <<<EOT
<html><head><title>Benchmark Test Page</title></head><body bgcolor=
"$bgcolor" text="$textcolor" link="$link"><p><font face="$fontface"
size="$fontsize"> Here is some test text for the web page.</font>
</p></body></html>
EOT;

}

?>


<?php
#The new bench2.php

$bgcolor = "#FFFFFF"; $textcolor = "#000000"; $link = "#0000FF";
$fontface = "Verdana,Arial"; $fontsize = 2;

for($i=0; $i<10000; $i++){

echo "<html><head><title>Benchmark Test Page</title></head>";
echo "<body bgcolor=\"$bgcolor\" text=\"$textcolor\"";
echo " link=\"$link\"><p><font face=\"$fontface\" size=\"$fontsize\">";
echo "Here is some test text for the web page.</font></p></body></html>";

}

?>

When the benchmark was run on the new scripts, heredoc notation clearly won again:

[shaun@agaliarept test]$ time php bench1.php > outfile

real 0m24.587s
user 0m0.000s
sys 0m14.103s
[shaun@agaliarept test]$ time php bench2.php > outfile

real 0m36.327s
user 0m0.000s
sys 0m19.580s
[shaun@agaliarept test]$ time php bench1.php > outfile

real 0m24.638s
user 0m0.000s
sys 0m14.241s
[shaun@agaliarept test]$ time php bench2.php > outfile

real 0m34.035s
user 0m0.000s
sys 0m19.542s
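
To put the timings above in proportion, this small sketch works out how much less wall-clock ("real") time the heredoc version took in each run. The figures are copied from the output above; the function name percent_saved() is made up for the example, and the syntax assumes PHP 7.1 or later.

```php
<?php
// "real" times in seconds: [heredoc version, multiple-echo version].
$runs = [
    'one-line test, run 1'  => [7.908, 11.133],
    'one-line test, run 2'  => [7.865, 11.005],
    'HTML page test, run 1' => [24.587, 36.327],
    'HTML page test, run 2' => [24.638, 34.035],
];

// Percentage of the multiple-echo time that the heredoc version saved.
function percent_saved(float $heredoc, float $echo): float
{
    return round(($echo - $heredoc) / $echo * 100, 1);
}

foreach ($runs as $label => [$heredoc, $echo]) {
    printf("%s: heredoc took %.1f%% less time\n",
        $label, percent_saved($heredoc, $echo));
}
```

Across the four runs the saving comes out at roughly 28 to 32 percent.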

Of course, a real benchmark would have run these tests hundreds of times apiece instead of just twice, so this data is hardly scientific. It's also worth noting that your average script doesn't do something 10,000 times in a row, so the CPU time saved is going to be negligible in most cases. The experiment was enough to make me a believer in heredoc notation, though; I won't be using multiple echo statements in the future.
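
For anyone who wants to repeat the test many times, here is one possible harness. It is a sketch, not the method used for the numbers above: time_variant(), the two closures, and the trial count of 100 are all assumptions made for illustration. It also captures the output with ob_start() instead of writing it to a file, so it leaves out the cost of actually writing the output and may not reproduce the timings shown earlier.

```php
<?php
// Hypothetical helper: runs $variant $iterations times and returns
// the elapsed wall-clock time in seconds.
function time_variant(callable $variant, int $iterations): float
{
    ob_start();                  // capture output in memory
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $variant();
    }
    $elapsed = microtime(true) - $start;
    ob_end_clean();              // throw the captured output away
    return $elapsed;
}

$string = "Test Data";
$number = 412;

// Same output as bench1.php, wrapped in a closure.
$heredoc = function () use ($string, $number) {
    echo <<<EOT
The string is $string and the number is $number.\n
EOT;
};

// Same output as bench2.php, wrapped in a closure.
$twoEchoes = function () use ($string, $number) {
    echo "The string is $string ";
    echo "and the number is $number.\n";
};

$trials = 100;                   // assumed repeat count
$totals = ['heredoc' => 0.0, 'two echoes' => 0.0];
for ($t = 0; $t < $trials; $t++) {
    // Alternate the variants so drift in system load hits both equally.
    $totals['heredoc']    += time_variant($heredoc, 10000);
    $totals['two echoes'] += time_variant($twoEchoes, 10000);
}

foreach ($totals as $name => $total) {
    printf("%-10s mean %.6f s per 10,000 iterations\n",
        $name, $total / $trials);
}
```

Calling each variant through a closure adds the same small overhead to both sides, so the comparison between them should still be meaningful even though the absolute times are not.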




Copyright 1999-2008 Shaun