PHP and MYSQL: Recursion

Written by Evan Petersen on Sunday, 13 November 2011. Posted in Programming

Creating a threaded comment platform

In this tutorial, we’ll be creating a system that can handle threaded comments.  What do I mean by threaded comments? Well imagine this scenario…

INTRODUCTION

Bob submits a comment asking others what the answer to question #2 in his math homework is

               Sally responds to Bob’s comment saying that she thinks the answer is 5

                              Joe responds to Sally’s comment asking how she got 5

               John responds to Bob’s comment with the answer 7

Threaded comments allows for a response to any comment, not just the most recent.  This is a popular model when discussing complicated issues where there are multiple parts to a problem.  So the question is, how do we store information in a database in a meaningful way so that we can later retrieve the information and display it in the format above?

 

RECURSION

For those of you that just shuddered at the mentioning of the word, just hold on for a moment and you’ll quickly realize that it’s not that bad.  Recursion can be used to formulate elegant solutions to complicated problems.

For those of you that don’t know what recursion is, picture the following:

The mathematical operator (!) is what we refer to as a factorial.  If one were to ask what the answer to 4!, we could arrive at the solution by performing the following calculations:

4*3*2*1

Each component of the calculation above is dependent upon the previous number.  If the original problem was 6!, then we would know to multiply 6*5*4*3*2*1.  In this case, we multiply a given number by that number -1 (subtract 1), and then that number multiplied by itself -1.

The process that we just went through is generally known as “identifying a relationship”.  In order to create a recursive function, we must be able to break a large problem (the factorial) into a series of smaller ones.

The second step to the problem is referred to as the “base case”.  Our recursive function needs to know when it has fulfilled its duty and no longer needs to break the problem into smaller pieces.  If we look at our example, we know that the problem has been broken up into the simplest part when we multiply by 1.

 

TRANSLATING ENGLISH TO CODE

Here is the PHP  code that would drive the factorial function explained above:

function factorial($n) {
   If (n == 1) {
      Return 1;
   } else {
      return $n*factorial($n-1);
   }
}

 

Line 2 is our base case and line 4 is our relationship.

 

CREATING THE TABLE STRUCTURE

If we have a given comment of id 1, and two comments of ids 2 and 3 that directly correspond to comment 1, how would we best represent that in a database?

1.  First Comment (by user 1)

               2. Second Comment (by user 2)

               3. Third Comment (by user 3)

To solve this problem, I created a self-referential table.

Table: `ticketText`

ID

userID

referenceID

message

timestamp

1

1

First message

11/13/11 2:30

2

2

1

Second Message

11/13/11 2:35

3

3

1

Third Message

11/13/11 2:37

For our purposes, we won’t worry about creating a table of users (which the userID column would point to), but know that it does exist.  Additionally, I have included a timestamp function so that it would be easy to see when a particular comment was made rather than just in relation to other comments.

Think of the referenceID column as a pointer to another record’s ID column.  It points to the message that the comment should be attached to.  This way we can create a many to one relationship, providing for the building ground of a threaded comment system.

 

APPLYING THE RECURSIVE METHOD

Storing comments in a threaded system is the easiest part to this system.  Retrieval and representation of the data is where the magic comes in. 

Similar to that of our example with factorials, we need to identify the relationship and the base case.  The relationship is directly defined by our table structure; are there any records pointing to the record that I am looking at.  Similarly, the base case is met when there are no records pointing to the record that I am looking at.

 

THE CODE

<?php
 
$db = mysql_connect("$dbHost", "$dbUser", "$dbPass") or die ("Error connecting to database.");
 
mysql_select_db("$dbDatabase", $db) or die ("Couldn't select the database.");
 
function getComments($ID) {
 
   $ticket = mysql_query("SELECT * FROM ticketText WHERE `ID` ='$ID'");
 
   while ($ticketRow = mysql_fetch_array($ticket)) {
 
      echo '<p style="">'.$ticketRow['text'].'</p>';
 
   }
   
   $thread = mysql_query("SELECT * FROM ticketText WHERE `referenceID` ='$ID'");
   
   if (mysql_num_rows($thread)>0) {
 
      echo '<div style="margin-left:10px; border-left: 1px black dotted; ">';
 
      while ($nextTicket = mysql_fetch_array($thread)) {
 
         getComments($nextTicket['ID']);
 
      }
   
      echo '</div>';
 
   }
 
}
 
getComments(1);
 
?>

 

CODE EXPLAINED

The only tricky part (other than the elephant in the room that is recursion) is the many to one relationship.  To identify ALL tickets that reference the ticket at hand, we have to run through a foreach loop with a MYSQL call that checks for all tickets that reference our ticket.  We then recursively call our getComments method for every ticket that points to the ticket at hand.

 

THE OUTPUT

first comment

second comment

another test

wahoo!

woo!

complicated

deep thread

wooo!

this is getting longer!

blah

even longer!

so complicated

super deep thread

third comment

blah

very very complicated

so much text



TAKE AWAY

I sincerely hope that you won’t be afraid to approach recursion when presented with a problem of this magnitude.  This is only one of many practical cases in which recursion is one of the better solutions.  Don’t use recursion for problems that can be easily defined iteratively; your solution will always have a higher cost if defined using recursion.

Due to the complex nature of the topic, please feel free to post questions relating to the example above or that of recursion in general.  I’m here to help!

Regards,

Evan Petersen

About the Author

Evan Petersen

My name is Evan Petersen, and I work as a Programmer in Southern Oregon. You can visit the home page of my blog at: www.EvanPetersen.com.

I enjoy reserarching new methods to solve age old problems and later sharing my findings with the community at large. Hopefully you'll find something of use!

Follow me on G+

Comments (15)

  • mario

    mario

    14 November 2011 at 19:34 |
    Let me put it as polite as I can. The internet really doesn't need ANOTHER tutorial that tutores the outdated mysql_ functions, and then AGAIN without any mention of proper escaping.
    • Evan Petersen

      Evan Petersen

      14 November 2011 at 21:50 |
      This blog serves as a training ground for those that are new to the world of programming. I focus more on the concepts at hand and understanding what is going on in the script.

      Additionally, there is no benefit to using the newer MSYQL functions. I liken it to that of object oriented programming; it can come in handy, but don't just use it because it's new and exciting.

      You're totally right, I didn't mention anything about escaping. That would probably be because we never allowed the client to insert anything into the database. If you are referring to HTML escaping, then yes, you are completely right. Ultimately, everything should be stripped out before it hits the database, so I'm not sure as to why you are concerned with proper escaping when pulling from the database.

      Regards,
      Evan Petersen
      • Andy

        Andy

        15 November 2011 at 04:21 |
        "You're totally right, I didn't mention anything about escaping. That would probably be because we never allowed the client to insert anything into the database."

        Wow. Just wow.

        If you do not know the dangers of exposing your SQL to the outside without proper sanitation, then you do not deserve to be called a web developer, nor should you be writing tutorials.

        You could do all sorts of things, even if it is a SELECT query. You *can* inject stuff into the database, and you could also retrieve sensitive information, such as user credentials.

        If you intend to teach others without being properly educated yourself, then all you'll do is hurt the web further.
        • Evan Petersen

          Evan Petersen

          15 November 2011 at 08:21 |
          As you can see in line 35, the first comment call is hard coded. The system that we built today provides no interface to the end user. All comments listed in the table were created by us. The intention of the tutorial was to give a demonstration of a time that recursion can be useful and how to implement it in an effective manner.

          I would also like to take a moment and talk about user privileges. I would hope that someone submitting select queries would create a separate database user with nothing more than select privileges. Giving a user more than what they need is far from best practice; it's downright foolish.

          Does that make sense?
        • Ryan

          Ryan

          15 November 2011 at 13:31 |
          Dear Andy and mario... I think you're both missing the point of these tutorials. As I understood it, this site is to learn how to program. Obviously you too have been around the block and already know what's what, so why are you wasting your time here? I am new to this and just want to learn the basics... Don't make it so complicated that I can't begin to understand the underlying code, let alone the security issues.
  • daGrevis

    daGrevis

    15 November 2011 at 02:59 |
    Wait, wait, wait...

    Each time recursive function is called, two queries are executed, right? And it will be called for each 'level' in comments, right? Insanity. :D
    • Evan Petersen

      Evan Petersen

      15 November 2011 at 08:22 |
      You are correct! Two queries will be called each time. The second query is a search to see if there are any comments pointing to the comment that I am currently looking at. Then we loop through each one of those, calling the getComments method for each one.
      • daGrevis

        daGrevis

        15 November 2011 at 09:57 |
        You know that it's very bad from performance too, right? :)
        • Evan Petersen

          Evan Petersen

          15 November 2011 at 13:27 |
          I wouldn't go so far as to say that it's really bad for performance. After implementing in a test atmosphere, with about 30 comments and 5 layers of depth, it took about .003 seconds to run. As I mentioned before, recursive definitions will always be slower than their iterative counterpart.

          :)
          • daGrevis

            daGrevis

            16 November 2011 at 00:55 |
            I'm not worried about recursion as it is. I'm worried about count of queries made. I believe that there is better way (with better I think - with less queries) to accomplish the same effect.

            What do you think? :)

            P.S. Anyway, good tutorial.
        • Evan Petersen

          Evan Petersen

          11 December 2011 at 21:05 |
          You are quite welcome!

          And as a side note, I just want to say that I am a huge Android fan and have developed a few Apps in my time (unfortunately they are private and are only to be used inside the company). Keep up the good work!
      • Evan Petersen

        Evan Petersen

        11 December 2011 at 21:03 |
        Much appreciated! Be sure to take a look at my latest post which gives a brief overview of the PDO alternative to this article!
  • financial management

    financial management

    07 December 2011 at 12:51 |
    I look forward to reading more of your articles and posts in the future, so I've bookmarked your blog. When I see good quality content, I like to share it with others. So I've created a backlink to your site. Thank you!…
  • Rosalinda

    Rosalinda

    07 January 2012 at 04:43 |
    Hey, klielr job on that one you guys!
  • Evan Petersen

    Evan Petersen

    17 November 2011 at 21:47 |
    The one idea that I had for limiting the number of queries is to add a GROUP_CONCAT() to the first query which would consist of a join against the comment table for any tickets that reference the ticket at hand. The only problem is that you are still dealing with a sub-query. What do you think?

    Also, thanks for the Kudos!

Leave a comment

You are commenting as guest. Optional login below.